Linus on GIT and SCM
An anonymous reader sends us to a blog posting (with the YouTube video embedded) about Linus Torvalds' talk at Google a few weeks back. Linus talked about developing GIT, the source control system used by the Linux kernel developers, and exhibited his characteristic strong opinions on subjects around SCM, by which he means "Source Code Management." SCM is a subject that coders are either passionate about or bored by. Linus appears to be in the former camp. Here is his take on Subversion: "Subversion has been the most pointless project ever started... Subversion used to say, 'CVS done right.' With that slogan there is nowhere you can go. There is no way to do CVS right."
And we should trust your unsupported assertion more than the educated opinion of the originator and head of one of the most successful large software projects in history because...?
Well Linus didn't have anything bad to say about MS Source Safe. . .
;-)
[ducking] Sorry, I couldn't resist the urge.
Wow, that's the worst idea I've ever heard
I hope you're working for one of my company's competitors, if you are so eager to hamstring your developers and limit their productivity! Having to wait for someone else to finish a major piece of development before I can fix a bug in an unrelated section of a file they happen to be modifying... yeah, that's the way to turbocharge your development process.
CVS and Subversion are open source projects, Linus should fix them.
anybody have a good tutorial? (not the crappy one which comes with it)
I'm not an SCM rube either. I've competently used tla (arch), darcs, and of course CVS. but git just seems too hard to use. damn fast though.
He is only human. Just because he is the head of a huge software project doesn't make him infallible.
Just look at the whole 'RMS vs Linus' thing.
His opinions should carry some weight, especially since he should know more than anyone what the limitations of SCM software are when it comes to larger projects like the Linux kernel. But a lot of SCM comes down to the way a project is managed, the preferences of the people involved, and how they deal with their project. I doubt there is a blanket solution... a 'one SCM package to rule them all' so to speak.
Especially in the software industry you can always find someone just as good as yourself that strongly holds opinions that are the polar opposite of yours.
We ALL know that the people who use CVS and SVN are version control Nazis!
I've used CVS, SVN, and GIT in serious projects and I can say I far prefer SVN to GIT, and GIT to CVS. GIT was incredibly confusing to use. It may just have been that the repository was poorly administered, but I never knew if I was synced with everyone else's checkouts, and the command names made no sense. It's been over a year so I don't remember the details of GIT, but I remember having to do a lot of things "twice". Need to do a checkout? Two commands. Need to commit? Two commands. It was a bitch to use and I am glad I'm done with it. SVN, on the other hand, I felt very comfortable with from the start and, most important of all, I trusted SVN to do what I wanted it to and to keep me from screwing up. In a year of using it, it has failed to lose my trust.
I'm not trying to say SVN is better than GIT. The best repository depends on the type of project and type of development. But defaming SVN in favor of GIT is not, I believe, a valid statement. Especially when (I'm pretty certain) many, many more projects use SVN rather than choosing to use GIT.
Hero of Allacrost, a FOSS RPG for *NIX/*BSD/OS X/Win
No one said that if you're famous and contributed something incredible to the world (such as Linux) you can't speak out of your ass most of the time, just because you enjoy how everybody listens and tries to decipher whether they should care about it, or just laugh and pass by.
I use SVN in a medium-sized team and see SVN used extensively in all kinds of projects around the globe with great success. I personally love the workflow of SVN.
The only thing that they need to work on is merging of branches, and incidentally I've talked to the developers; they're quite aware of this flaw of SVN and are working on it. We'll see new versions that can track changes in each branch and even attempt automated merges with good success.
I know a guy who has the same personality as Linus. The guy is very smart; he is single-handedly coding an application which is very popular in its area (won't mention it since that's internal stuff). He keeps bitching all the time: about customer feature requests, about random products and how sucky they are, how people can't see that. And he can also change his opinion overnight for no apparent reason and go to the other extreme. But he's a friggin' programming genius and what he does is great, even though it takes a lot of effort to deal with him.
Well, probably those two go together: being an amazing creator, and being an amazing ass with huge ego. Who knows.
... And that is that CVS/SVN are centralized, while GIT is distributed, like GNU Arch.
There are appropriate uses for both of these, and in kernel development I think it makes sense to have distributed development. However, smaller projects often really *need* a very specific direction (for example, Wesnoth, I would think, would not have gotten where it is today if there were so many branches where people were all making their own art).
Linus is enough of a famed leader that he's going to be listened to, and thus kind of pulls the community around him as a central source of development. That's not necessarily going to happen everywhere.
http://mediagoblin.org/
You missed the point of the thread; to discuss git, not to be one.
I personally like how git has excellent Microsoft Windows support... It makes it a great tool for use with Altium PCB design software because of the handy svn->git compatibility tools that git has for windows. It allows all people in the enterprise to use git, regardless of the platform that they use.
;-)
Most definitely better than SVN, right?
--jeffk++
ipv6 is my vpn
If you're working with a good SCM and have somebody with a clue who administers it (I've worked in a large ClearCase setup for years, with a great admin staff), concurrent development isn't that hard to do. Good tools make the job easy.
File locking is ok for 2 or 3 developers; any more than that and it sucks bad.
PHP is the solution of choice for relaying mysql errors to web users.
CVS is already done right. These would-be improvements are pointless.
My favorite, of course, is Mercurial. My main draw is that I had been interested in distributed SCMs for years, but had never found one that made any sense to me whatsoever. I was on the hunt again and stumbled on Mercurial, and I've been hooked ever since.
Of the various distributed SCMs, Mercurial is the easiest to use one I've found. And it's pretty fast, though not quite as fast as git (though I have some ideas on how to fix that). And since it's written in Python with only a very small C component it runs on many platforms.
Need a Python, C++, Unix, Linux develop
I took a look at git a while ago and was completely underwhelmed. The UI was so bad it was useless, and it didn't "seem" to do anything that Darcs didn't do. (I used to love Darcs because of the automatic patch dependency computations).
Now that all the "next generation" SCM tools have matured somewhat, I took a look at all of them again. I had to stop using Darcs because of the "patch of death" problem, which basically is this: after using Darcs on a project with long-lived parallel branches, the repository may eventually enter a wedged state you can't get out of, due to exponentially complex patch dependencies. Oops.
At this point I had an idea of what an SCM should do, how it should work, what the "mental model" should be. I want to create changesets, add them to branches, combine multiple branches (and keep track of renames and so forth between branches), re-order changesets, collapse multiple changesets into one, discard old branches, etc.
Of course, CVS and close cousin Subversion are SO UTTERLY USELESS I didn't even consider them. Seriously, Subversion is like gold-plated shit. Looks nice but it's still shit. Reading people say stuff like "Subversion is awesome" makes me wince. How can something that doesn't have "real" branches, and doesn't have tags OF ANY KIND, be useful for anything? How do you keep track of multiple merges between branches? Answer: you don't. Or you keep track of revision numbers using svnmerge and pray it all works. Even the Subversion docs sort of hand-wave this away. I.e., they hand-wave away one of the FUNDAMENTAL ASPECTS of source code management: branching and merging. It's like hearing people talk about OO databases. They mean well but they just don't comprehend the generality of the underlying problem.
That's why I was so excited about Darcs: the author "gets it". Unfortunately the implementation is flawed.
I checked out a few more (Mercurial, bzr) but finally settled on git because it let me do all the things I needed to do, and it did them FAST. Once I figured out the underlying model I was pretty impressed. Git can be viewed at many levels: very low-level plumbing, or UI-level, or in between. The UI and documentation are still pretty shitty, but thankfully they are working on improving them and are moving away from the idea of having interchangeable UIs. Just focus on improving "core git".
One great thing about git is that so much of it is just files in the .git dir and shell scripts that combine very simple low-level functions. For instance, you can create a branch just by saving the SHA1 ID of the tip into a file in .git. You can branch off any point in the history this way, including branches you've deleted in the past (git keeps all the old commit objects by default, even ones that aren't pointed to by any branch or tag... this is a very simple and understandable model, like reference-counting in a way).
The other great thing about git is how easy it is to sling changes around and reorder them and combine them. For instance let's say you add a file to your project as commit "A". Then you add some code that uses this file as commit "B". Then you fix a bug in the file as commit "C". So you have A-B-C. Now you'd like to combine A and C into a single patch A', and put B on top of it, like this: A'-B. In git, this is super-easy. I can think of two ways to do it off the top of my head.
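To make the A-B-C to A'-B rewrite concrete, here is a toy sketch in Python. It assumes a made-up model where each commit is just a name plus a dict of file path to new content; that is not how git stores history, and the helper names (`apply_history`, `squash`) are hypothetical. In git itself this kind of rewrite is usually done with an interactive rebase.

```python
# Toy model (an assumption for illustration, not git's storage format):
# a commit is a (name, changes) pair, where changes maps a file path
# to that file's new full content.

def apply_history(commits):
    """Replay commits in order and return the resulting file tree."""
    tree = {}
    for _name, changes in commits:
        tree.update(changes)
    return tree

def squash(first, second, new_name):
    """Fold `second` into `first`; later content wins for shared files."""
    merged = dict(first[1])
    merged.update(second[1])
    return (new_name, merged)

a = ("A", {"util.py": "def helper(): return 1/0\n"})   # add the file
b = ("B", {"main.py": "import util\n"})                # code that uses it
c = ("C", {"util.py": "def helper(): return 1\n"})     # fix the bug

original = [a, b, c]
# Moving C in front of B is safe here only because B and C touch
# different files; overlapping changes would need conflict handling.
rewritten = [squash(a, c, "A'"), b]

final_tree_unchanged = apply_history(original) == apply_history(rewritten)
```

The point of the sketch is that the final tree is identical either way; only the shape of the history changes.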
I was checking into a CVS project the other day (for a client) and wanted to do this. Then I realized, you can't move things around in CVS like this *twitch*. So nowadays I do everything in git and only after the changes are beautiful and self-contained and well-commented do I check them into CVS one at a time.
Okay so the point is, check out git (or honestly? Check out ANYTHING that isn't CVS or svn). Even if you think Linus is an asshole (which he is) or you don't like the git UI (it's not that bad now), check it out anyway.
And if you don't use SCM at all? You suck. Start learning. It's a best practice that you can't live without, once you start.
The thing is, you've got the wrong solution to the problem. Rather than not allowing branches, you need to control when and how often they're made, and how long they're allowed to survive. You're fixing a policy problem with technology, which never works well. If the branches are kept under control, you don't have the last-second merge problem. Merges should be happening constantly throughout the process so everyone stays in sync. If someone isn't committing their work at least once a day, that's when they get a stern talking to from the lead developer. Because if a developer needs to coordinate with another developer to change one line of code, then you've wasted two people's time instead of one.
Linus's position as Linux kernel project leader may make him knowledgeable about what makes a good source control system for Linux's distributed development model, but if he thinks that one size fits all for source control then he's definitely talking out of his ass and out of his area of expertise.
You might want to check out TortoiseSVN if you're using svn on windows. It makes version control really easy, and you don't even have to touch the command line.
Anthropic principle: We see the universe the way it is because if it were different we would not be here to see it.
The ultimate reason why Linus dislikes SVN, CVS, etc. is that it is centralized. Everyone checks out source from a central server and commits their changes to the same centralized area. This has problems: your workspace is not versioned. By this I mean, you cannot track local changes to your workspace without committing them to the central server.
A common pattern in development is to try one approach, test it, tweak it, and possibly try another approach if the first did not work out, perhaps reverting to a prior approach. With decentralized version control, you can commit your changes to a local repository and work from there. All the local changes you make are versioned, and can be committed, checked out, and examined, all without contacting a central repository. This is ideal, because you often want to try various options to find the one that works best before pushing your changes to the rest of the world. In centralized version control, you can use a branch for this purpose, but often branches in these systems are difficult to create, merge, or maintain, so they are rarely used. The end result is that with centralized version control, developers version their workspace in their head. DVCS systems remove the mental burden.
Fortunately, FOSS developers are realizing the usefulness of DVCS, and major projects are converting to some form of DVCS. Mozilla is switching to Mercurial. The Pidgin project, which just released 2.0.1, is using Monotone. (Linus favorably mentioned both of these version control systems in his Git talk, as they are both distributed.)
Once you accept that DVCS is better than the centralized model (which may not be true for some situations), only a few (but growing number of) version control systems are viable. This is currently a hot area in open source development, with software such as GNU Arch, Monotone, Mercurial, Git, Darcs, Bazaar, and more paving the way. Many open source DVCS's are still in development and not ready for general usage. I can't speak for Mercurial, but Monotone doesn't have the greatest performance, instead preferring integrity over speed. This led Linus to write git, since speed is very crucial for a large project like the Linux kernel.
Whatever the actual program (git, Mercurial, or Monotone), more and more open source developers are realizing the advantages that distributed version control can offer. I encourage all developers that haven't used any DVCS to try it -- once you do, you won't go back.
Tired of free ipod spam sigs? Opt ou
No need, I'm sure anyone with half a brain and an ounce of self-respect would have run screaming from such a fabulous work environment long before you had the chance to demonstrate your blindingly self-evident superiority to them.
Linus talks about his distributed model, how everyone has a branch, and how this avoids politics associated with who gets commit access. He claims (and I admit I've seen this happen in some) that many projects have quite the internal politicking on who has CVS commit access. But then he claims that Git's special sauce eliminates these internal politics. Ok, I was intrigued, so I listened on.
Essentially, he explains, the secret with Git is that everyone has commit access on their own branch - they do whatever they want. He says that the way it works is that someone does something cool with their own branch, then they start hollering to say "Hey, I have a good branch, merge mine" and it will get merged. Politics over.
Ok, so now I'm scratching my head. How is this a fundamentally different paradigm? In CVS, basically anyone can check out the whole tree and make any changes they like. They can then say, see, my changes are good, and ask for them to get committed or ask for commit access themselves. In Git, this commit access bottleneck is just moved from the commit stage to the merge stage. You make your changes, commit them to your separate and unique branch, and then ask someone with "the power" to merge it, or to give you the ability to merge it into mainstream. How exactly does this eliminate the politics? You are still going to have some people with "the power" and some people without. In any project where you have people who are going to fight about who gets commit access, you'll just have a fight about who has the ability to merge into mainstream.
So, ok, distributed is nice (though for some projects central may be preferred) but I don't see how this magic system bypasses politics. In fact, I can potentially see more internal politics over this method. I can see factions gathering to support this or that branch, arguing about which is better, fighting about which one gets merged in. I can see the potential for branches going longer between merges, and more changes happening at once, making it harder to track problems. I don't claim these scenarios are more likely, but I do claim that this changing from a commit access to a merge access paradigm is just renaming the problem.
If you have a project that has thousands of developers all over the world like Linux does, an SCM system that is focused on merging makes a lot of sense. Unfortunately, there is a tendency for some people to overdo merging on small projects when they don't really need to. If the application is designed in a modular fashion and developers are assigned specific modules, then merging is rarely needed. Of course, many control freaks don't like this approach because it makes it harder for them to "correct" other developers' code.
I use SVN on windows, mac os x, linux (ubuntu, debian, fedora) as well as netbsd. TortoiseSVN works great on windows, especially for the point and click style users who need to use SCM. SvnX works great on Mac OS X. Altium PCB designer works great with the svn command line tools and shows graphical diffs of our circuit boards. But for some reason, TortoiseSVN and svn.exe are unable to access a GIT repository.
In addition, git works well for simple projects but not so well for projects that have many different related subprojects which share code.
For instance, our SVN repository holds everything needed for an entire product, including embedded linux with busybox, initrd and custom software and libraries - as well as DSP source code for two different add on cards, the GUI for mac, windows, and linux, the docutils xml file for the various manuals, and manufacturing and test code.
I'd love to use git once it attains the required maturity level so that I can do what I need with it.
You are the biggest ass I've seen on slashdot. And I've seen a lot of asses on slashdot. But bragging about how much money you make and the vacations you take, as if somehow that means your opinion is correct....wow.
Smart to post as a coward tho, gotta give you that.
You hit the nail on the head. Distributed version control often comes with superior merging, making the process less painful and encouraging it to occur frequently. Monotone employs a 3-way merge, Codeville has an innovative merging algorithm, and some may even support 5-way merging ("left's immediate ancestor, left, merged, right, right's immediate ancestor") in the future.
In my experience, nearly all merges occur automatically and cleanly. Only if two developers modified code in conflicting areas of the source code do you have to merge manually--and even then, only one person has to do it. It is much better to have merging operate automatically and transparently when possible than to have two people manually coordinate each and every one of their changes beforehand.
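The basic rule behind a 3-way merge can be sketched in a few lines of Python. This is a simplified illustration at whole-file granularity, under the assumption that a tree is a dict of path to content; real tools (Monotone, git, and so on) apply the same idea to lines or hunks within a file, and the function name here is hypothetical.

```python
def three_way_merge(base, left, right):
    """Merge two trees (path -> content) against their common ancestor.

    Returns (merged, conflicts). A path is a conflict only when both
    sides changed it relative to base and changed it differently.
    A deleted file is represented by a missing key.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:        # both sides agree (including both deleted)
            result = l
        elif l == b:      # only right changed it
            result = r
        elif r == b:      # only left changed it
            result = l
        else:             # both changed it, in different ways
            conflicts.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, conflicts

base = {"a.c": "v1", "b.c": "v1", "c.c": "v1"}
left = {"a.c": "v2", "b.c": "v1", "c.c": "left edit"}
right = {"a.c": "v1", "b.c": "v2", "c.c": "right edit"}
merged, conflicts = three_way_merge(base, left, right)
```

Here `a.c` and `b.c` merge automatically because only one side touched each, and only `c.c` needs a human.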
So what you are saying is that RCS was done right and everything done since is wrong...
Excuse me, but please get off my Pennisetum Clandestinum, eh!
I wrote about Linus's talk a few weeks ago:
http://kylecordes.com/2007/05/17/linux-git-distributed/
Looking back at that, and at your comment, some things come to mind:
* the tool Linus is pushing greatly facilitates the idea of frequent, easy merges, and Linus mentions that a tool with great, fast merges helps you merge early and often.
* on the other hand, your comment is about "you need to control when and how often [branches] are made...", while a big point of distributed SC tools is the opposite of that control: these tools make the power of the tool fully available to all users. A "main" repository may (and probably should) have permissions/hooks set to enforce some policy about what happens to what branches. Individual users can always create local quasi-branches by simply not checking things in; with a tool like this they can create real (local) branches too, which can then be promoted to official status (i.e. on a blessed central repository) if needed.
It sounds like you really mean, "Doing big merges at the last minute sucks." But it's more generally the case that doing anything major at the last minute sucks. Merging two weeks before ship date is no different than checking in a big new feature or a substantial refactoring or a fix for a bug in a critical system component two weeks before ship date. If you're doing any of those things, it's a failure of your project management, but that failure has absolutely nothing whatsoever to do with merging per se.
Merging or checking in a major new feature six months before your ship date is not any problem at all, and your methodology throws away the advantages of branch-based development during the bulk of the project's lifetime for the sake of a minor additional level of safety (which can be avoided with proper project management anyway) at the very end. If your problem is that "merges happen at the last minute" then honestly you need to look at your development processes as a whole.
Even at the last minute, though, not all merges are major operations. In a development shop that embraces git's "branch often" philosophy, merges tend to be small and frequent, and thus no one of them has much of a chance of breaking the source base. Further, before you merge a feature-development branch into your main source base, you will almost always have done the opposite first, and done a full round of testing at that point. If you do that, the merge is almost a no-op.
You may be saying this in the context of large FOSS projects, but for most projects, not allowing all the team members to commit changes seems like a really bad idea. If you don't trust them, why are they on your team?
Complaining about the occasional inefficiencies of file locking while forcing some developers to waste time waiting for permission to commit, seems really ironic to me.
So don't do it
Wow! I bet you have never worked on anything other than hobby projects.
Most projects I have worked on cannot do without branching & branching big & I am not talking about branches created for individual devs.
What do you do if you have to make patches on an earlier release(s)?
What do you do if your project team has 50 devs working on 5 different modules inside? If one guy makes a buggy submit it will break everyone else? Typically each team does weekly sanity tests & then propagates the changes to the main.
Yeah - and I agree with Linus - CVS is rubbish.
Have used CVS, Clearcase & Source Depot. Source Depot is a Microsoft internal Source Control system. Microsoft licensed Perforce & developed on it. I used to work with MS long back & Source Depot was the best Source Control System I have ever used.
CVS lacks too many features.
1) Atomic checkins/submits
I am trying to submit changes in 5 files as a single bugfix. A submit/checkin should either succeed for all 5 or fail for all 5. CVS doesn't do this. The end result is that I may end up submitting a change in the header without submitting a corresponding change in the implementation file.
2) Changelists
After checking in multiple files together, at any point in time, I should be able to find out all the changes that were checked in at the same time. CVS has no way of doing this - submitting 5 files together is the same as submitting 5 files separately as far as CVS is concerned.
3) More Changelist features for non-submitted changes
Let us say I am working on 3 different bugfixes. Source Depot allows me to group together my changes in different changelists even before I submit the changes. That is, I can create changelists A, B & C. In changelist A, I have files a.c & a1.c changed, in changelist B, I have b.c & b1.c changed & so on. So if I decide I am done with all the changes required in the subset A, I can submit it very easily, or undo all changes in changelist B.
4) Merges
Merges between branches are a breeze with Source Depot. With CVS it's a pain. Source Depot stores a lot of information about merges which have already happened, which is invaluable. In CVS, merges between branches are very little more than changes manually copied from one branch to another.
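Points 1 and 2 above (atomic check-ins and changelists) can be illustrated with a small in-memory sketch in Python. The `ToyRepo` class and its `submit` method are hypothetical names invented for this example; this is not how Source Depot, Perforce, or any real tool is implemented, just the all-or-nothing behaviour being described.

```python
class ToyRepo:
    """In-memory repository where a check-in is all-or-nothing."""

    def __init__(self):
        self.files = {}        # path -> current content
        self.changelists = []  # one entry per successful check-in

    def submit(self, description, changes, expected):
        """Apply `changes` only if every file still matches `expected`.

        `expected` maps path -> the content the developer started from
        (missing or None for a new file). If any file has moved on,
        nothing at all is written.
        """
        stale = [p for p in changes if self.files.get(p) != expected.get(p)]
        if stale:
            raise ValueError("out of date: " + ", ".join(sorted(stale)))
        self.files.update(changes)
        # Record which files went in together, so they can be found later.
        self.changelists.append((description, sorted(changes)))
        return len(self.changelists)  # changelist number

repo = ToyRepo()
first = repo.submit("add api", {"api.h": "h1", "api.c": "c1"}, {})
try:
    # api.c is stale here, so the header change must not land either.
    repo.submit("bugfix", {"api.h": "h2", "api.c": "c2"},
                {"api.h": "h1", "api.c": "old"})
    rejected = False
except ValueError:
    rejected = True
```

After the failed submit the header is still at its old content, which is exactly the half-applied state the comment says CVS can leave you in.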
I can do a lot of stuff which I can't do with CVS:
- I can very trivially merge Bugfix 1111 (comprising 5 files checked into changelist XXXX) from a branch to another branch or the main trunk.
- Because Source Depot stores information about merges, I can do periodic single command merges very easily between a branch & the trunk - Source Depot will not try to merge in changes which have already been merged the last time I did a merge.
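The "don't re-merge what was already merged" behaviour comes down to keeping a record of merged changelists. A minimal Python sketch, assuming changelists are identified by plain integers and using a hypothetical `merge_branch` helper (this is not Source Depot's or Perforce's actual mechanism):

```python
def merge_branch(source_changelists, already_merged):
    """Return the changelists to apply now and the updated merge record.

    `source_changelists` is the ordered list of changelist ids on the
    source branch; `already_merged` is the set recorded by earlier merges.
    """
    pending = [cl for cl in source_changelists if cl not in already_merged]
    return pending, already_merged | set(pending)

branch = [101, 102, 103]
first, record = merge_branch(branch, set())

branch = branch + [104, 105]          # more work lands on the branch
second, record = merge_branch(branch, record)
```

The second merge only picks up 104 and 105. Without the record, the tool (or the developer) has to work out by hand which changes were already copied across.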
I could go on & on, but the point is that something like Source Depot makes a developer's life so much easier. I could work around all these things in CVS (i.e. do it in multiple steps) but the ease is something worth paying for I think. If Microsoft ever released Source Depot as a commercial product, it would be great, but I don't suppose their license with Perforce would allow it.
The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
Why? It doesn't have to be. At least if you use something that isn't horribly broken.
Yes, they will. Because this is a monumentally stupid idea. Because the entire *purpose* of revision control systems (note: "CVS" stands for "Concurrent Versions System") is to make it possible for developers to work on things at the same time. The idea is that you can get more benefit from the concurrency than you get difficulties from merging.
Rules like "merge early, merge often", perhaps? Fixes the problem, and *doesn't* cripple development horribly like your idea would.
There's one trick to getting performance from monotone, which is to flip a switch on your workspace to make it use timestamps (like SVN does) instead of always re-hashing every file to see if it's different. For small projects, the rehash is best since it is certain. With timestamps on unix, if you make changes within 1 sec, for example copying a different version right after an update (which can happen btw), then version control will not check in your changes and they can be lost.
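The trade-off between the two checks can be shown with a short Python sketch. It is a generic illustration, not monotone's or SVN's actual code: the helper names are made up, timestamps are truncated to whole seconds to mimic a one-second granularity, and `os.utime` is used to simulate an edit landing in the same second as the recorded state.

```python
import hashlib
import os
import tempfile

def snapshot(path):
    """Record what a tool might cache: size, mtime (seconds), content hash."""
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha1(f.read()).hexdigest()
    return {"size": st.st_size, "mtime": int(st.st_mtime), "sha1": digest}

def changed_by_timestamp(path, snap):
    """Cheap check: only looks at file metadata."""
    st = os.stat(path)
    return (st.st_size, int(st.st_mtime)) != (snap["size"], snap["mtime"])

def changed_by_hash(path, snap):
    """Certain check: re-reads and re-hashes the whole file."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest() != snap["sha1"]

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "file.txt")
    with open(path, "w") as f:
        f.write("version A\n")
    snap = snapshot(path)
    # Simulate an edit within the same one-second tick: same size,
    # and the mtime forced back to the recorded value.
    with open(path, "w") as f:
        f.write("version B\n")
    os.utime(path, (snap["mtime"], snap["mtime"]))
    missed_by_timestamp = not changed_by_timestamp(path, snap)
    caught_by_hash = changed_by_hash(path, snap)
```

The metadata check costs one `stat` call per file but misses this edit; the hash check always catches it but has to read every byte, which is where the slowdown on large trees comes from.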
Once you enable timestamps with monotone pretty much all operations are faster than subversion. Even reverting can be faster in practice because the server typically has the files in ram vs your workstation which has to seek all over the place to make copies. Depending on your setup of course.
Monotone is not slow anymore, and it keeps a much tidier and smaller repository. So small that in just a little more space than SVN's spare copies of all HEAD files for the past revision you can have all revisions on your workstation. Why anybody would use subversion is beyond me... Linus is right on this one.
Every developer has their own repository, which they can commit as branches into a repository of repositories :P
The perfect sig is a lot like silence, only louder
Distributed version control the way git does it (conceptually, not necessarily the implementation) is the best idea in SCM since concurrent development and optimistic merge conflict resolution on check-in.
Notice how, even years after better ideas superseded the lock-modify-unlock paradigm, many tools and shops still use exclusive-lock SCM.
It could be quite a while before you see anything like the way git does SCM in use in the majority of programming shops.
We used VSS for a long time but switched to SVN after reading numerous accounts that VSS would eventually croak, and because we needed multiple developers working on the same source. After all the developers (~16 very active users, 20+ projects) were over the learning curve of SVN, there's not one that would go back to VSS. I can honestly say there is nothing I know of that we need that SVN doesn't do for us. We use TortoiseSVN for the Windows guys, and the server-side guys (Linux/Unix) use command-line SVN. We have no need to branch local copies. We very rarely have to manually resolve conflicts. I fail to see what GIT would do for us.
Perhaps SVN sucks for kernel-guys. But for what we do, SVN fits the bill perfectly... Central repository, easy to get up-to-date, easy to commit, easy to update, easy to review changes, easy to review history....
SVN for us is the right tool for the job.
So, how do you take that diff, revert half of it back to the server's version, begin coding in a completely new direction, realize you were right the first time, go back to the original diff you took, then pull in half the stuff you did while doing the wrong thing, finish coding and push the commit back to the server?
You can't because subversion has no client side version control.
Have you found one that does handle subprojects well? I use darcs for most everything nowadays, but I'm always open to new ideas.
Exactly. With a centralized version control system (PVCS, which is not coincidentally listed as the riskiest bet on the Forrester Source Code Management comparison) I've used in the past at a large company, everyone ended up making several different local copies of the code with various changes, in order to revert if necessary. I was dumbfounded - isn't that what version control is for, to keep track of changes?
That doesn't have to do with centralized/distributed. ClearCase is super-centralized, but you do your experiments in branches, as many as you like.
Monotone's inode prints (which, incidentally, Linus was a major contributor of) can speed up some things, but the initial pull of a large repository is still unacceptably slow. The Pidgin developers have worked around this performance bottleneck by supplying bzip2'd Monotone databases via http, which the developer can then sync with the latest repository on pidgin.im to obtain an up-to-date database with the latest changes. Partial pulls should partially fix this problem in a future release of Monotone, or so I hear.
For what it's worth, I use Monotone daily and find the performance acceptable. For the record, Linus used Monotone at a particularly bad time in its development cycle, when it was very slow and the main designer was on vacation. Nonetheless, the Monotone developers emphasize correctness and integrity over speed, and Mercurial and Git were direct responses to the performance of Monotone. Still, the performance of Monotone is always improving.
For large projects with lots of developers who work via the Internet, your suggestions just don't make good sense.
If I start a project and twenty people eventually join in and they check out various parts of the project but don't check them back in for long periods of time, following your suggestions would be horrid.
Linus is right. You want distributed repositories not a centralized one.
The race isn't always to the swift... but that's the way to bet!
What does "same code" have to do with SAME FILES? There are plenty of opportunities for developers to edit different parts of the same file without stepping on each other's toes. On the other hand, someone changing a header file to remove a function can wreak major havoc on people who just wrote the code that relies on it.
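To make the "different parts of the same file" point concrete, here is a rough Python sketch of a three-way merge. It is a toy under a loud assumption: all three versions have the same number of lines (only in-place edits), which real merge tools do not require. The function name is made up for illustration.

```python
from typing import List, Tuple


def merge_same_length(base: List[str], ours: List[str],
                      theirs: List[str]) -> Tuple[List[str], List[int]]:
    """Toy three-way merge; assumes nobody added or removed lines."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged: List[str] = []
    conflicts: List[int] = []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:        # both sides agree (or neither touched the line)
            merged.append(o)
        elif o == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "ours" changed this line
            merged.append(o)
        else:             # both changed it differently: a real conflict
            merged.append(o)
            conflicts.append(i)
    return merged, conflicts


base = ["int a();", "int b();", "int c();"]
ours = ["long a();", "int b();", "int c();"]     # we edited line 0
theirs = ["int a();", "int b();", "long c();"]   # they edited line 2
merged, conflicts = merge_same_length(base, ours, theirs)
print(merged, conflicts)
```

Note that a merge like this is purely textual. It would happily combine one person's removal of a function prototype with another person's new call to that function, which is exactly the header-file havoc described above.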
Surely we all agree that, if semantically-aware source control tools are developed, it's a good idea to give developers exclusive locks on, say, basic blocks that they are actually modifying or function prototypes for functions they are calling in those blocks.
atomic checkins?
'cvs mv'?
'cvs cp'?
And that's without even exerting 3 brain cells.
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
Git has some great features. Speed, that the whole repository with revision history is mirrored, that it's consistent cryptographically, etc.
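For anyone wondering what "consistent cryptographically" means in practice, here is a small Python sketch of how git names a file's contents (a "blob"): as far as I know, the object ID is the SHA-1 of a short header plus the raw bytes. The function name is my own, and this covers only blobs, not trees or commits.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the ID git would assign to a blob with this content."""
    # Header is the object type, a space, the size in bytes, and a NUL.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Changing a single byte gives an unrelated ID, so corruption is detectable.
print(git_blob_id(b"hello world\n") != git_blob_id(b"hello world!\n"))
```

Because the name is derived from the content, any copy of the repository can re-hash its objects and notice if something was altered or damaged.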
There is one part that I don't get and it's the decentralized part. Yeah, it is a big bonus that potentially any copy can take over should something happen to the main one or that developers can create branches and share code with each other without relying on a central server, but the part that bugs me is that according to Linus the right model is when there is a maintainer like him that awaits emails either sending him patches or giving him git repository addresses and telling him to pull. For most projects this is simply an unbelievably stupid idea, waiting for a person to judge your patches one by one. Most small to medium open source projects don't work this way.
Also, there is the fundamental misunderstanding that decentralized means that there is no central server/primary copy. This is patently false even in the case of Linux. Linus' tree is the central server. For 95% of the people THAT tree is the linux kernel. For 4.9999% it is the 'real' linux kernel. For the remaining 0.00001% or less, well those are the forks.
It's quite simple. There is a decentralized environment, but there exists a main/most influential copy. If you diverge too much from that main copy, that's a fork.
So I was saying that for small to medium scale projects, pretending that there is no centralized server, just people's repositories, is stupid. For large scale projects it can work, like it does for Linux, but then you have a dedicated core team that is necessary in judging what goes in and what stays out. It doesn't matter if you call it person X's tree or commit rights to the central repository. That is the same thing. The terminology Linus uses is annoying because it lies. Not Linus, but the terminology.
It takes a man to suffer ignorance and smile
Be yourself no matter what they say
So don't branch, and DON'T allow concurrent checkout of any code - FORCE the DEVELOPERS who need to work on the same code to COORDINATE their work EARLY in the development cycle. Of course they'll bitch.
That's very naive. At a minimum, you will (eventually) need:
a) main trunk (2.0 development)
b) emergency production patch (1.0.1)
c) QA Patch (1.1) 1.1 is in QA for a week or two. People are working on 2.0. You will need to branch to fix the odd bug in 1.1 that QA finds.
You will need decent branching and merging tools to get the bug fixes from 1.0.1 and 1.1 into 2.0. It's also amusing when someone finds a bug in 2.0 that needs to get back ported to 1.0.1 and 1.1. Agile/continuous integration only reduces the chance of needing to branch and merge, it doesn't completely eliminate it. The price for trying to figure out a branching/merging plan at the last minute is very high. And don't forget that the change control system (the "red tape") also needs to support branching.
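A rough Python sketch of the bookkeeping this implies: given which fixes have landed on which branch, work out what still has to be ported where. The branch contents and fix names are invented for illustration, and a real tool tracks changesets rather than labels.

```python
from typing import Dict, Set


def fixes_to_port(branches: Dict[str, Set[str]],
                  all_fixes: Set[str]) -> Dict[str, Set[str]]:
    """Return, per branch, the fixes that have not been applied there yet."""
    pending: Dict[str, Set[str]] = {}
    for name, applied in branches.items():
        missing = all_fixes - applied
        if missing:
            pending[name] = missing
    return pending


# Hypothetical state: one fix made on 1.0.1 and merged to 1.1,
# a second fix made during QA on 1.1, nothing merged to 2.0 yet.
branches = {
    "1.0.1": {"fix-crash-on-login"},
    "1.1": {"fix-crash-on-login", "fix-report-totals"},
    "2.0": set(),
}
all_fixes = {"fix-crash-on-login", "fix-report-totals"}
pending = fixes_to_port(branches, all_fixes)
print(pending)
```

Doing this by hand in commit messages or a spreadsheet is where the "very high price" comes from; the point of decent merge tooling is that it keeps this record for you.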
Yeah, but he's really mean!
GIT and SCM? You mean Geita, Tanzania, and Scammon Bay, Alaska, USA?
What sound do people on rollercoasters make? Hint: it's not Xbox 360.
Richard Dawkins spent a good deal of time in his book, "The Blind Watchmaker" talking about what the gradualist and the punctuationist view of Darwinism is. His gripe was that the latter was sold as a whole new theory, opposing the old gradualist view. Dawkins was rightly pissed about this, because the latter is merely an improved version of the former. I feel the same about the Centralized vs. Distributed topic. The distributed system is basically a centralized system where EVERY COPY HAS FULL REVISION HISTORY.
There is still a central or main copy, otherwise you'd be herding a lot of slowly diverging forks! Most projects want to produce a release eventually and there is a main copy of sourcecode which the release is produced from.
Imo, the reason Linus dislikes SVN and CVS and pretty much everything else is because of speed, because most SCMs lack the ability to work with merging different copies of repositories and work on a commit level instead, and do not allow for easy development routing around the central copy.
It takes a man to suffer ignorance and smile
Be yourself no matter what they say
I think you just described git - I can do all of that in git.
Faster and with less faff than using *any* other version control system I've tried. However, MKS Source has some really nice features... shame it's so expensive, really!
Pirate Party UK
You cannot write anything in C using just its about 5 commands/statements.
You need to use some C system (or other) libraries and these are the compatibility problem of C.
It is about those POSIX, SystemV, ANSI, BSD, GNU etc. etc. standards conflicting with each other, with incomplete specifications etc. Also you face architectural problems of different word size, endianness etc. (this remains true even for high level languages regarding binary network protocols).
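The byte-order point is easy to show even in a high-level language. A minimal Python sketch using the standard `struct` module follows; the length-prefixed framing is a made-up example protocol, not any particular standard.

```python
import struct

# The same 32-bit integer has different byte layouts depending on byte order.
big = struct.pack(">I", 1)     # big-endian:    00 00 00 01
little = struct.pack("<I", 1)  # little-endian: 01 00 00 00
print(big.hex(), little.hex())


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte network-order integer."""
    # "!" means network byte order (big-endian) with a fixed size, so the
    # result does not depend on the host's word size or endianness.
    return struct.pack("!I", len(payload)) + payload


def decode_frame(frame: bytes) -> bytes:
    """Read the 4-byte length prefix and return exactly that many bytes."""
    (length,) = struct.unpack("!I", frame[:4])
    return frame[4:4 + length]


print(decode_frame(encode_frame(b"hello")))
```

If either side packed the length in native order instead, two machines with different endianness would disagree about the frame size, no matter which language the code was written in.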
The thing is, Linux is actually a pretty small project. Much larger projects would include FreeBSD, which uses CVS not only for the kernel but for every line of source of the entire OS. Now, Linus is a smart guy, but I don't know why he thinks CVS (and SVN by extension) won't work for large projects. It clearly can. It may not be suitable for the way he wants to run his project, but that's a different issue.
Dewey, what part of this looks like authorities should be involved?
Depends what you're doing in C.
Threads? Doing anything with the OS? Uh oh.
Hello World in C runs on more platforms than Hello World in Python, but Python abstracts a lot of less trivial stuff so it works cross platform without rewriting.
I haven't worked with GIT so far, but i did watch the talk from Linus and some stuff he mentioned sounded like very familiar problems. Mainly when he said that even patch is better than SVN. And that's where i got the feeling that those tools solve two different problems:
SVN is better than doing normal backups for sourcecodes.
GIT seems better than working with patch.
I agree with your statements in general. I have projects at work that use VSS, and some in CVS / SVN. VSS sucks. However, you mention your solution is web based - so I assume it isn't C++ (which I assume the legacy system is, since you mention .h). I would wager the fact that you are using a more modern language and tools probably helped out a wee bit.
Again, not to discount your point - VSS sucks, hard - but I don't think we can take the discrepancy in time as purely due to the difference in version control.
Amusingly though, both Git and Mercurial were "inspired" by Monotone, but were created as separate projects because the developers wanted to go in different directions
Yeah and luckily the whole "haves versus have nots" on who gets CVS commit access rights has never, ever, been a problem in *BSD or XFree86. Right?
Seriously, centralized version control fails for large open source projects for political reasons, not technical ones. That's really Linus' main point, although his lack of tact in presentation is going to cause many people to miss that insight. With a changeset-based distributed version control system, you only have to trust patches and code, not people. The whole concept of "the chosen few who get commit access" goes away, and problems like the XFree86/X.org fork or the EGCS/GCC semi-fork disappear.
I was at the talk and I have to say he lost a HUGE amount of respect from me (and other people in the room whose job has to do with source control).
The way git works as a decentralized solution with a chain of trust is simply not useable for really large, multiple projects with interdependencies. And it's even worse when you need to control access to certain portions of the code.
I see Git as a pyramid scheme with Linus sitting on top. I can't begin to imagine the job of the poor release engineer in a big corp who would need to merge the changes of sub-engineers and the chain of trust involved to reach the top! What I see is that everyone would code and test on out of sync code, a bit like Vista's development was.
Git is a solution that is fine tuned to Linus's specific needs, but it's ages away from a solution that's flexible for most of the industry's needs.
I'm a big fan of subversion, and while I'll admit it's far from perfect it's way better than cvs could ever be. It does the job well most of the time, and SVK is filling some of the holes.
Personally, I just let the development tools manage my local workspace. They generally do a better job anyway, since they know what you are doing to it. Eclipse is my favorite tool because of this - it has local file history (with versions you can inspect, compare, and revert to) as well as undo history for all of the refactorings you have been doing (Java only, so far ...).
This paired with SVN or CVS solves all my problems with local workspace revisioning.
Darcs is arguably easier to use, although it doesn't scale to large projects the way Mercurial can. In particular, Mercurial requires commits at odd times (pull + merge) and doesn't support the same level of cherry-picking that darcs can support. Anything you could imagine wanting to do is likely possible in darcs; the main problem this raises is that supporting that rich model makes efficient implementation difficult or potentially impossible. However, for small projects (<100k LOC), this is not really an issue, so darcs is hard to beat.
I'm sure it has been a big problem, but e.g. in KDE (which is also quite large), getting commit access isn't exactly hard. And anyway, SVK provides decentralized versioning backed by a central repository, so SVN doesn't preclude this.
"Seriously, centralized version control fails for large open source projects for political reasons, not technical ones."
I agree completely, if it fails at all.
"That's really Linus' main point, although his lack of tact in presentation is going to cause many people to miss that insight."
His lack of tact is legendary. But his lack of tact is what makes people read his every comment, and I hardly believe I am alone in smiling when he spews out those one-liners.
"With a changeset-based distributed version control system, you only have to trust patches and code, not people. The whole concept of "the chosen few who get commit access" goes away, and problems like the XFree86/X.org fork or the EGCS/GCC semi-fork disappear."
Disappear? Hardly, the patches still need to be accepted. But decentralized repositories are wonderful to work with, and as such remove a technical hindrance.
Each man differs in his opinion of versioning software. For fun, here is mine: (only those I've tried)
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
So Use SVK, which uses the base libraries of Subversion (the atomic, versioning filesystem ones which are heavily tested and work very well) and uses them to build a distributed SCM.
http://en.wikipedia.org/wiki/SVK
I am NaN
Subversion can be quite useful for a project's "authoritative" repository. Especially if that project used to successfully use CVS, as a great many small projects do/did - and some larger ones, like GNOME and KDE, too. Subversion is also quite convenient for publishing sources, though it's less than ideal for any contributors without commit access trying to work from anonsvn.
svn is supported by a number of IDE plugins & GUIs, which a surprisingly large number of people use and come to rely on. I'm not one of them, but many of the folks I work with use various svn guis.
git-svn looks very interesting, as it should provide a way to add distributed scm capabilities on top of svn, where you're working with projects that use svn. It'd be useful even just for the ability to take partial local history and keep local modifications under revision control. I wonder if there's anything similar for Mercurial...
What bothers me most about svn is the insufficient integrity guarantee on the repository. That, however, can be fixed, and I hope it's going to be addressed with an `fsfs2' format. Frankly, not everyone *needs* distributed SCM, and many are quite fine with a good centralized system.
Sorry but this is ridiculous. I'm on a project where there are 5 or 6 independent and major features each with business and back-end dependencies which may or may not be ready in time for a particular point release. All need to be QA tested and therefore need to be checked in for release engineering builds. Branches and merging are the only sane way to manage it. I'm sure some projects have hundreds of features and subsystems in concurrent development. It would be impossible to feature work on the same branch without severely disrupting development and QA testing. Saying otherwise is living in cloud cuckoo land.
Your method does work, but it is not the best method. Coordinated incremental merging of sub-projects throughout the entire development life-cycle also works, assuming you have a few developers who are detail oriented enough to merge code reliably. Also, one problem with merging is that most shops don't code-review merges, which is just nonsense in my opinion.
Branching often and avoiding double-commits seems to be the way to enable parallel development. (double commits as in committing the same change to two branches, you should merge before you get to that point)
Also just because developers bitch doesn't mean they are wrong. Developers tend to whine when they think they have some pointless burden placed on them that is preventing them from doing their job. (their idea of what their job is and the management's idea of their job rarely match up perfectly). Developers often promise schedules based on the assumption that they won't have to merge or suffer a "code-freeze" or be blocked by arbitrary rules used to beat developers over the head when management does not think they are coordinating.
the more often you merge the less work it is. when you merge branches constantly the changes end up being trivial and you don't have to stress and fuss over a bunch of conflicting changes. it helps if you can hold a quick meeting with the two or three people that produced a merge conflict, so this technique really only works if your team is all in the same building/campus.
I view development as a dance, we all step on each other's toes for a while but once you get used to your dance partners you get better at it. it just takes a long time with developers because they are antisocial and not hugely team oriented. Developers, in general, hate depending on other people to do their job. They also are quick to blame others for failing to take them into account (another antisocial behavior is assuming everyone around you is a mindreader, another word is egocentric)
it's not that developers are prima donnas, which is something software managers often assume. it's that some personality types don't realize that there is actually anyone else around them. often developers have no idea that any of their peers even do any work. they might as well be equivalent to furniture, something to avoid tripping over in the hallway, but otherwise irrelevant.
“Common sense is not so common.” — Voltaire
Never trust an argument from authority unless it has the proper reasoning to back it up. Besides, Linus is managing a highly distributed, Linux-only (by definition) project. Git might suck badly if you have cross-platform requirements, or a centralized repository, a different development model, or simply want UI tools & IDE integration.
It would be interesting to know more details of what you were trying to do, to see if there is some non-ACL way of mapping it to distributed VCS functionality. From a distributed VCS background, I would probably do something like the following:
(1) Split off non-development files into a separate repository, with different permissions to that tree
(2) Give release engineers their own tree which developers cannot push to; if a release engineer needs a fix from a developer, he can pull it.
In fact, a whole lot of nice things fall into place when you make pull the fundamental operation rather than push. The general workflow is for a developer to finish an implementation, checking in as necessary, and then notify an upstream or more central developer that the patch is complete. The upstream developer reviews the patch and pulls it if it is correct (works, doesn't violate policy, etc). In this flow, code review is emphasized, and at no point is any developer trusted with "push" rights until you get to the final central integrator (usually a release/QC person).
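The pull-only flow can be sketched as a toy model in Python. This is not how any real DVCS stores history; `Repo`, `passes_policy`, and the change descriptions are all invented to illustrate the shape of the workflow.

```python
from typing import Callable, List


class Repo:
    """Hypothetical stand-in for one person's repository."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.changes: List[str] = []

    def commit(self, change: str) -> None:
        # Commits only ever touch the owner's own repository.
        self.changes.append(change)

    def pull(self, source: "Repo", review: Callable[[str], bool]) -> List[str]:
        """Copy changes from `source` that pass review; return what was taken."""
        accepted: List[str] = []
        for change in source.changes:
            if change not in self.changes and review(change):
                self.changes.append(change)
                accepted.append(change)
        return accepted


def passes_policy(change: str) -> bool:
    # Placeholder review rule: reject anything marked work-in-progress.
    return not change.startswith("WIP")


alice, release = Repo("alice"), Repo("release")
alice.commit("fix null check in parser")
alice.commit("WIP: rewrite scheduler")
accepted = release.pull(alice, passes_policy)
print(accepted)
```

There is deliberately no `push` method: the only way a change reaches the release tree is for its owner to pull it after review, which is the property the workflow above relies on.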
I would like to watch the video, but it seems impossible without flash. Anybody got a link to download the video file?
Wouldn't it be safe to say that more platforms support C than Python?
You have no idea what you are talking about, do you? Any platform that supports C also supports Python, unless you count really tiny ones that do not have enough memory.
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Or the "secret sauce" for Linus's work is actually soap and water?
Amen to that; saying that an SCM is a "software configuration manager", when you are using it to manage source code (and not software configuration), has always struck me as incredibly silly~
I mod down anyone who says "I will be modded down for this", regardless of the rest of their comment
http://www.youtube.com/watch?v=4XpnKHJAok8
This is the video from the article. You can either watch it in the tiny embedded window, or you can go to youtube and click the button to watch it full-screen.
Look, posters: if you're going to point to a video that's hosted on YouTube (or another video hosting site), just link to that site. Don't link to some random web page that has the video embedded in it.
--
Change is certain; progress is not obligatory.
Agreed. I've seen critical patches held up for up to a year because the "QA" process was not allowed to test and approve individual components for release, but had to hold up or roll back the entire software release if any individual component failed. There are uses for monolithic development models, but for large projects it's quite crippling. Developers get frustrated and lose the ability to share their work with others. Managers who don't always have the technical expertise needed wind up being gatekeepers on both development and communications.
I'll be delighted to test out git and see if it works well, simply based on Linus's very strong recommendation. The branching model is particularly attractive for live redundant repositories.
On medium to large projects -- 10 to 100 developers maybe, on 1 to 5 sites, with a significant amount of metaconfiguration which is itself versioned -- it is simply impossible to forbid concurrent editing of code.
It's also extremely hard to avoid branching. To say 'there is no branching' is to say 'nobody has any changes that need to be in version control, and yet should not be forced on all developers / all releases'. It's only possible to say this about extremely small projects -- small enough to have only one stream of development going at once.
I'm stating the obvious here but it's worth repeating because some people do have a lot of trouble understanding that proper SCM (as opposed to the parent post's conception of SCM) is necessary.
Whence? Hence. Whither? Thither.
I thought it was to discuss what Linus said.
Personally, I think he's on crack, going on and on about the "Great Pumpkin."
GLaDOS for President 2016! "Well here we are again. It's always such a pleasure." -- GLaDOS, 2011
Either that or darcs, although darcs has a major drawback in not being able to be shortened to three letters well.
Please, for the good of Humanity, vote Obama.
That might be a good idea as a part of a political protest (e.g. the guy on the Rage Against the Machine album cover whose name escapes me), but that guy's ideas on merging wouldn't even be useful as that!
Please, for the good of Humanity, vote Obama.
Linus has enough credentials and competence to give his opinion some serious weight. And even if he does run roughshod on SVN and CVS quite a bit, he is right in a lot of points.
I've been using SCM on a regular basis with Subversion for about half a year now, and more than once I have thought: 'This can't be the way SCM is done right'. The only thing I know of that SVN does better than CVS is pure entire-project version numbers. CVS seemed like a kiddie bike with training wheels that wanted to be as complicated as Kornshell. SVN did away with that but still has downsides that we needn't put up with for any reason. SCM metadata stored in each project directory is one of those little things that bug the hell out of me, for instance. And when Linus says that his brain isn't damaged into thinking CVS is an OK way of doing things because he only used it when forced to - I believe him. I wasn't aware that his project Git had come so far so fast until two weeks ago, but when he says that any SCM that he can do better in two weeks of coding on his own isn't worthwhile - and SVN is one of those - that really has me thinking. I'm not so much into software development with large groups that I could tell exactly how bad SVN is, but I'm sure Linus can.
To use an analogy: CVS was a Ford Model T, SVN is an improved Ford Model T, but I think we should start looking for a current BMW or something. It might be a good idea to do that *before* we move a larger audience to using SCM.
We suffer more in our imagination than in reality. - Seneca
Ignoring Linus' heinous unprofessional attitude, massive ego, and completely insulting comments, there's a lesson to be learned here: you and your team need to decide whether you want centralized or decentralized version control. There are advantages and disadvantages to both methodologies. Anybody who gets up on a stage and tells you that "all centralized systems are garbage, decentralized is the one true way" isn't giving you the full picture. (And likewise, anyone who says the opposite is equally off their rocker!) 80% of software development takes place within corporations, and there's a reason centralized SCM has worked so well in that environment. Decentralized systems might be great for certain open source communities, but it's not what most organizations want or need. If you'd like another viewpoint on why centralized might sometimes be better than decentralized (even in open source projects), take a look at this essay I wrote a while back.
I'm one of the original designers/developers of Subversion, and even we (in the svn developer community) are well aware of both sides of the coin. We're seriously considering adding decentralized features to svn 2.0. We've also added true merge-tracking magic to the imminent svn 1.5 release (so svn is no longer "hand waving" merges, they'll be just as simple as in decentralized systems.)
If you truly believe that distributed SCM is the Only Way of working in all situations, then I suggest you try to push these systems on corporate teams, and see how they fare. Distributed systems have a model that's much more complex for the average joe-user to understand, and as a result most existing distributed systems have extremely complicated UIs. If they're complex enough to confuse open source nerds, think about the rest of the world's programmers...
Keep an open mind about this stuff. No matter what Linus says, there's no magic SCM bullet.
And the Linux kernel probably has more people working on it than FreeBSD.
Please, for the good of Humanity, vote Obama.
I heard he sleeps with nunchucks.
Please, for the good of Humanity, vote Obama.
What you're citing as advantages for decentralized version control are not the result of decentralization.
Cheap branches? Subversion has cheap branches.
Better merging? This is a result of algorithms and has nothing to do with whether the system is centralized or not.
If you're on a fast net with the server you can commit as often as you like. If you can branch/merge easily it's no problem.
If you want to cite advantages for decentralized version control it might be more like:
If you have to talk to a server over slow links, decentralized is much better
Linus' speech was really about git, which is just as free as SVN and CVS.
Please, for the good of Humanity, vote Obama.
Wow, who are your clients - I'll work cheaper and get the job done just as well. That's some mighty powerful snake oil you are selling if you've convinced folks you are worth $400/hr.
My approach? In practice, we don't branch much at all, there is just too much risk when merging a long lived branch. I do however let people work on the same piece of code at the same time. I encourage frequent synchronization and commits.
To force folks to wait until someone else is done with a particular source file is just not practical or productive. If people are properly tasked, and not working on the exact same thing, it's rare that they will both need to edit the same file, and when they do, they are usually working on different parts of it.
But when you get down to it, projects don't go over time and over budget because of misuse of source control. They go over time and over budget because of mismanagement. You can't expect all developers to be good at coordinating their work with others. That's the job of the project manager.
He emphasizes engineering over ego. It's a litmus test. If you've got the sack to get in the game, you better not mind seeing your work demolished. This is an anti-prima-donna vaccine for any organization.
So long as he remains consistent, even-handed, and not ad-hominem, it's OK.
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
The manpage discusses the name.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
Personally I made 3 false-moves to svn from cvs before managing to actually move to svn for real.
Some months ago my svn repository died and thankfully i have backups.
Having said all that, I always felt SVN was a move in the wrong direction for the right reasons. Considering though that I use svn for only my own work (with only me coding in it), git kinda seems pointless for me (though in development with more than 2 people involved I can certainly see where it would kick butt over svn). For me, I was thinking of getting rid of svn and putting my repos onto an ext3cow fs!
I find it a little odd actually that people can't see why per-coder branching with a merge model is a big win-win. To me this is like a big "WOW, we can do QA in a logical manner finally" (keep in mind, a merge can be another branch if I understand it correctly). Of course, I haven't really used git myself other than a quick play to see what it's like.
On a side note, 2 companies I've worked with have black-listed SVN cause of its ability to be configured in an authentication mode involving plain-text. It didn't matter that no one there had planned on using that particular functionality, just that it even existed.
Don't get me wrong, a lot of people (myself included) would love to use Darcs. It's done a lot of things right.
We can't.
Darcs is just too slow. You don't need a huge project to reveal this slowness, just a medium-sized one. Its merge algorithm freaks out when two patches appear at the same time that are identical (which happens a lot more than you might think). It's also too freakin' slow! I'm using git until Darcs can un-slow-ify. Then I'll go back to it (unless, of course, git can simulate what darcs does, in particular interactive pulls).
Slashdot. It's Not For Common Sense
Git works perfectly well with a centralized repository. This use case is fully supported and actually has some unique support in the git-hooks. However, it's discouraged socially.
I am not sure what kind of model git, darcs, Mercurial, bazaar, or monotone couldn't satisfy. They all scale from one person to many (albeit with speed concerns for some). They can be used in a variety of ways. The only really annoying part I've ever found is the lack of support for empty directories, but that's what .anchor files are for, I guess.
Git actually comes stock with some of the best UI tools I've ever had included with my version control system. Seriously. Go take a look at gitk and check out git-gui.
As for IDE integration, a google search will show you that's already falling into place, and it took me all of 2 days to get work to play nice with TextMate (my editor of choice).
Slashdot. It's Not For Common Sense
$ find /usr/src -type f -print0 | xargs -0 cat | wc -l
13193911
Now, I'm tracking -RELEASE and not -HEAD, so maybe they removed half the code for 7.0, but the 6.2 codebase is significantly larger than the Linux kernel.
Dewey, what part of this looks like authorities should be involved?
Since you seem to be an informed proponent of git, maybe you can answer the one question I wish someone would have asked Linus.
Say you have a team of as few as six programmers coming in for work on Monday. They've all made changes to their codebase (over the weekend, they're dedicated). How do they all manage to get each other's changes, and begin working with a completely up to date version? Do they pick someone to act as the centralized repository for that day?
Both are bloated monsters. EDLIN is all you should ever need!
The Tao of math: The numbers you can count are not the real numbers.
The SVN devs have known from the beginning that the lack of merge tracking constitutes, to use their verbiage, a "headache". For 5 long years, SVN's "Best Practices" solution was to track merges manually in the commit log messages. This "Best Practice" could be best described as, to use the technical terminology, "really fucking error-prone."
Look, I like SVN, I use SVN, I hope they get merge tracking (and 'svn obliterate', as long as I'm creating my Christmas List) ASAP. My only point here is that the great-grandparent's claim, that "It's trivial to branch and merge in SVN", is a heaping, stinking load of crap, to use the technical term. You know it, I know it, CollabNet knows it, everybody knows it.
I'm guessing that even sqlrob (173498) knows it.
"Avoid employing unlucky people - throw half of the pile of CVs in the bin without reading them." -- David Brent
Linus suffers from a common misconception: if something doesn't work the way he wants it to, he assumes that it's no good. And if he adds a feature that he finds useful, he can't understand why other people might object.
Fortunately, Linus's opinions on version control systems don't matter: there are lots of version control systems to choose from, and users just choose what works for them. I bet that's a lot more Subversion than git.
Heh. You could be right. I tend to do that myself, now that you mention it.
Religion is regarded by the common people as true, by the wise as false, and by rulers as useful.
CVS popularized concurrent versioning and many other ideas that we are taking for granted, and there have been distributed versions of it, too. Linus may think that git does everything different from CVS, but git owes a lot of its functionality to CVS.
As for Subversion, it does support distributed development via svk, and I suspect that's going to get integrated into Subversion.
You have no idea what you are talking about, do you? Any platform that supports C also supports Python, unless you count really tiny ones that do not have enough memory. So therefore, what you're saying is that more platforms support C than Python.
This solution seems chaotic to me. Now, instead of needing to pull all the changes from one central repository, I need to pull changes from the machines of all my co-workers individually? Wouldn't this system make it difficult to guarantee that each developer was integrating the work of the others? Also, it doesn't seem very scalable. What if I have 20 co-workers?
He's right about CVS, and more or less about SVN. Except for one thing: Subversion works. Not only in the technical sense, but in the sense that you can work with it, you can easily explain it to new developers, there is integration into lots of IDEs, code editors and other tools and the list goes on and on. (last, but not least: Trac!).
I used to be passionate about arch, for example. I'm fairly sure I would've been about GIT had it existed back then. But then I learned that to get real work done in the real world, the theoretical basis of your version control system matters little. If the system doesn't work for my developers - who like many projects are doing this for their fun and in their spare time - then it doesn't work, period. If I can't explain it to the boss at work, it won't get installed.
And that's why Subversion is everywhere and arch is, where exactly?
Now Linus is a man with his feet on the earth, so GIT may have a different fate. Wake me when Eclipse and Textmate have built-in GIT support and at least half of my potential developers know it.
Assorted stuff I do sometimes: Lemuria.org
Linus has written some very low-level subproject support, but at this point I think it's only interesting to very early adopters and/or people who can help hack on the higher-level infrastructure that'll be needed to make it usable.
Subversion 1.5 will have automated merge tracking. The merge tracking feature is already available on the trunk, and will be released in (probably) a couple of months. This link has more information:
http://subversion.tigris.org/merge-tracking/index.html
With the addition of the merge tracking feature Subversion will be almost at parity (feature wise) with commercial products like P4.
Git may be a great SCM tool for some situations, but for commercial development which doesn't require a distributed architecture, IMO, SVN is preferable to Git.
Git needs to be supported on more platforms and have a better user interface to be accepted as widely as Subversion.
I must mention TortoiseSVN at this point, because until Git has a user interface like TortoiseSVN, it'll never be accepted by Windows devs. I'm not asking for trouble here (in other words, I'm not trolling Linux users), it's just that I looked at lots of alternatives for our (mainly Windows and UNIX-based) shop and nothing else came close. All of our devs use Windows, even if just to host a terminal program, and the ease of training folks on the use of TortoiseSVN was a big reason for our switch to Subversion.
http://tortoisesvn.tigris.org/
FWIW, I'm a UNIX/OS X/Linux guy myself.
This sig kills fascists.
Subversion does do all of that, however.
The distributed system is basically a centralized system where EVERY COPY HAS FULL REVISION HISTORY.
No. The fundamental feature of a distributed VCS is that you can ALWAYS commit your current state and get back to it.
Once you have this feature, any centralized VCS can trivially be converted into a DVCS because on every commit you _DO_NOT_CARE_ what anybody else might have done to the repository.
In fact, the problem that DVCS have to solve is that they do NOT have a full revision history, because if you commit to the trunk in your repo and I commit to the trunk in my repo, we are both completely oblivious to the other commit until we merge (my statements above follow immediately from this).
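A toy model of that point, assuming a repository is nothing more than a set of commit ids (no real DVCS stores history this way, and the `Repo` class is made up for illustration): each side can always commit locally, and neither sees the other's commit until a pull.

```python
class Repo:
    """Toy repository: history is just a set of commit ids."""

    def __init__(self, history=()):
        self.history = set(history)

    def clone(self):
        # A clone starts with a full copy of the history known so far.
        return Repo(self.history)

    def commit(self, commit_id):
        # Always succeeds locally; no other repository is consulted.
        self.history.add(commit_id)

    def missing_from(self, other):
        """Commits that other has and this repo has never seen."""
        return other.history - self.history

    def pull(self, other):
        # Stand-in for a merge: afterwards this repo knows both lines of work.
        self.history |= other.history


origin = Repo({"c1"})
yours, mine = origin.clone(), origin.clone()
yours.commit("y1")
mine.commit("m1")
print(sorted(mine.missing_from(yours)))  # ['y1']
mine.pull(yours)
print(sorted(mine.history))  # ['c1', 'm1', 'y1']
```

The real work a DVCS does is in the `pull` step, which this sketch reduces to a set union; content merging and conflict handling are left out entirely.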
Tim.
God said, "div D = rho, div B = 0, curl E = -@B/@t, curl H = J + @D/@t," and there was light.
Flamebait? Gee, I should know better than to make jokes.
Support SETI@home
That's why I've started putting every line of code in its own file. I haven't had this problem since!
The Farewell Tour II
This isn't always possible. The company I worked for produced software that ran on medical instruments, unable to access the network for security reasons. You couldn't use centralized version control because you couldn't access a central server.
Tired of free ipod spam sigs? Opt ou
That's a valid concern, but the flipside is that DVCS allows you to commit early, and commit often. I often make small changes in my code, trying out different things, adding a function here, and it is not crucial for other developers to see these small changes immediately. However, they do see the changes -- every one of them, committed individually as I made them -- when I push to the server once I'm done working on a certain feature for the time being (at least once a day).
Tired of free ipod spam sigs? Opt ou
He went into this at the talk (and for a bit afterwards, when the cameras were off): To Linus, it's not that CVS and SVN don't "work the way he wants them to", it's that they're fundamentally flawed in their designs. He's all about the distributed model vs. the centralized repository. In fact, his tech talk was more about the design rationale behind git than it was about git itself. He simply thinks that the centralized repository model is the absolutely wrong way to go about SCM.
He liked bitkeeper, and that whole fiasco caused him to look for options. He found none, and decided to implement his own. While he was doing that, he thought he'd throw in a few new ideas that he liked.
So it's not just about them not working right, or nobody liking a certain feature. To him, there simply wasn't anything out there that met his needs, so he wrote something himself. Kinda like, well, Linux.
Anyway, even though I'll not likely ever use git, it was cool to see him. He's certainly got some opinions...
-B
Ash and Hickory, straight-grained and true, make excellent bludgeons, dandy for the cudgeling of vegetarians.
OK, distributions (which FreeBSD essentially is) inherently have more lines of code. Compared to projects working on a single piece of software, Linux isn't that small. And it has a big active developer community, so the tree sees lots of commits.
Hm. I've been managing to mostly avoid CVS recently, but a few years ago we tried developing with a linux kernel tree in CVS. My memory is that some operations that should have been nearly instantaneous (cvs update to fetch the latest version, cvs diff to compare two historical versions of the tree) were painfully slow.
I guess my question was more along the lines of "what sort of policies do you need to enforce with ACLs". Such as "a developer can only check-in under a certain subdirectory", or a certain file, or whether you need to be able to prevent developers from even reading certain subdirectories... I ask because I participate in the mailing list for an actively developed DSCM; If there's a single feature that is keeping it from being accepted in the corporate environment, the developers would like to know. From an implementation standpoint, write restrictions are relatively easy for a DSCM, while read restrictions would require more work. Several DSCMs are working on the concept of nested repositories / super-repositories as well, with some of the same usage cases as goals that ACLs might otherwise be used for. The more feedback on whether that could be useful, the better.
Sometimes he does this and tries to make a joke out of it, but more often than not you can see the real venom shining through.
Oh well.
See the video again... you don't need ACLs in GIT because it's a "pull system". Instead of giving someone access to "your repository" (let's say you are the coordinator of a large software project), YOU decide which people you pull from... you can do it based on a web of trust ("I know this guy, he produces good code!"), on corporate politics ("I only pull from the department chief programmers") or even on a case-by-case decision ("Damn, this smart guy from the user interface team explained the bug in the network code to me and how he fixed it... I should have a look at this code.").
I watched the video and was baffled by Linus' attitude.
The guy is bright on technology, but what he calls strong opinions I, and almost any business person, would call shortsighted opinions that do not appreciate other people's way of doing things.
Saying stuff like "Those of you that like SVN would probably want to leave the room. Because it's crap too." is very stupid. I for one like SVN. It's good enough for my purposes and it sure beats quite a few commercial tools I've seen. But I sure want to hear Linus' thoughts on the topic. Mainly because different ideas keep my mind fresh.
I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
He said: C supports more platforms than python. You said: You have no idea what you are talking about; C supports more platforms than python!
--
WHO ATE MY BREAKFAST PANTS?
Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
.. and now that I've watched the entire thing, start to finish, and wrote a near-complete transcript of it, Linus gets tonnes of things wrong. Oh Linus.. why did you have to begin spewing such diatribes about stuff you visibly know nothing about? Talking about Git is one thing, but then to claim that other products don't do the stuff git does? Sigh. I mean really.. narrowing down changesets to a single directory? Pulling a report of what's changed between two dates or two revisions, narrowed to a single directory or set of directories? Everyone can do that! And you went on for like.. two minutes about how tremendously powerful that ability was, claiming that you "guarantee" that no other system can do that.
Sooo disappointed at how little research you've done.
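For reference, the query being argued about (which changesets touched a given directory between two revisions) amounts logically to a filter like the sketch below. The `Changeset` record and the sample paths are made up for illustration; this says nothing about how fast any particular tool answers the question, which is a separate matter.

```python
from dataclasses import dataclass


@dataclass
class Changeset:
    revision: int
    paths: tuple  # files touched by this changeset


def changes_under(changesets, directory, first_rev, last_rev):
    """Revisions in [first_rev, last_rev] that touched a file under directory."""
    prefix = directory.rstrip("/") + "/"
    return [
        cs.revision
        for cs in changesets
        if first_rev <= cs.revision <= last_rev
        and any(path.startswith(prefix) for path in cs.paths)
    ]


history = [
    Changeset(1, ("net/socket.c",)),
    Changeset(2, ("fs/inode.c", "net/core/dev.c")),
    Changeset(3, ("fs/super.c",)),
    Changeset(4, ("net/ipv4/tcp.c",)),
]
print(changes_under(history, "net", 2, 4))  # [2, 4]
```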
The fundamental problem is that the smallest unit of operation is the 'file', where it should be 'the line of code'. Since we can check out files only, we have to merge changes. If we could check out lines of code, there would be no need for merging.
By ad-hominem, I mean using unique venom. Calling everyone in the room at Google 'morons' isn't ad hominem. Saying that, oh, Andrew Morton is the 'biggest lobotomy evar', or some other unprecedented slam, would be ad hominem.
Of course, I'm probably a colossal 'tard for tweaking the meaning of 'ad hominem' like this, as I tend to think you're right.
Get thee glass eyes, and, like a scurvy politician, seem to see things thou dost not.--King Lear
I used to think that he was always joking around when he called people idiots whenever they disagreed with him... I used to think it was all a bunch of tongue in cheek bravado (partially because that's the sort of sense of humor I have). However, I was talking to one guy who got chewed out by him before, and it turns out he's usually pretty serious.
I suspect that this is the sort of thing you see mostly in open source projects, where some people, like Linus, are their own bosses and egos get big. I haven't been in the industry that long, but most of the developers I've met are pretty polite and measured.
That said, Linus is pretty bright, and when he's bitching someone out he usually manages to put together a pretty compelling argument of why they're wrong. I usually agree with him more often than not, even if I would explain it without calling anyone an idiot.
Right. A consultant that comes in and FIXES things...
As Linus said in the talk, when people say they want "cheap branching" what they really want is also "cheap merging" ... SVN has the former, which is worthless on its own. If "fixing merging" in SVN was "just a matter of fixing the algorithms" why have the SVN/CVS developers failed to do it in the last 10/5 years? The reason is that how difficult the algorithms are depends on your storage model ... and the CVS/SVN storage models are broken. Also, even at GigE speeds, talking over the network is significantly slower than talking to the HDD.
Also I'd like to live in your world where you always have fiber type speeds to your central repo. ... and neither the network nor the machine itself ever goes down. Don't forget about when I've just taken a plane flight to X, which is 1,000s of miles from where I normally am. But however much I'd like to, I don't live in that world ... and I find it hard to believe that I'll live in that world in my lifetime.
I've also had to "contribute" to more than one "project"[1] using a CVS/SVN repo. where I didn't have commit privs. ... I find it hard to believe anyone who has lived through this pain could argue that it's a good idea, or helps anyone. So you must also be extremely lucky in that regard. You apparently live a blessed life.
[1] Project here isn't codeword for OSS, this is exactly as painful in CVS/SVN/clearcase/perforce/etc. inside a company.
ustr: Managed string API with ave. 44% overhead over strdup(), for 0-20B
>> I'm one of the original designers/developers of Subversion, and even we (in the svn developer community) are well aware of both sides of the coin. We're seriously considering adding decentralized features to svn 2.0.
Sounds to me like what this community is proposing is evolutionary changes in version tracking and SCM, away from the prima donna model of one guy at his terminal behind the wall.
This work won't be accomplished in a new version of git, but over several lifecycles and updates to the core routines that define what version update routines are and how they operate in a distributed environment.
>> We've also added true merge-tracking magic to the imminent svn 1.5 release (so svn is no longer "hand waving" merges, they'll be just as simple as in decentralized systems.)
"Magic"? You must understand sussman, there is nothing simple about merges within and without the use of decentralized systems. What I just said is the reason they're decentral and are operationally redundant in manner of time and place. The reason distributed connections should be able to handle co-dependent nodes in decentralized ways is to beat the central office model. The reason we want to beat the central office model is whenever someone takes repository A offline or makes line edits under the old framework of subversion and CVS, everyone else loses their work.
My day job is a clearcase administrator, so I have seen more than my fair share of merges. I'd love to get rid of them, but this isn't going to do it. Not even close.
- doug
As a CC admin, I appreciate your comments. But I do have to say that CC is nowhere near as fast as what Linus reports git to be. I've never used git myself, but when we import an update from montavista, it is going to take a while. It happens to run on its own without human interaction, but it is going to chug for a while.
I'm a clearcase admin by day, so I think I have some first hand experience in this area. A single branch means a single development thread, which is not a real world scenario. For the project that I'm working on there is one release in long term maintenance, a second going from development to maintenance (GA release just happened), a third under development (will GA at the end of this year, or early next), and early prototyping for a fourth. How again is this all supposed to work on a single branch?
Each of these four releases has its own branch, with sub-branches for development/fixes. How would your system of a single branch handle merging fixes from maintenance releases to in-development releases? And when fixes get back ported, can a single branch handle backwards merge. Does that concept even exist? This is ugly, but it is real-world.
Please don't tell me that each and every release would be modeled as a different project. That would be the cure being worse than the disease.
At the very end Linus explains that branches aren't the issue, it is merging. In this he is "spot on". I don't think clearcase got everything right (I've spent too much time with "merge hell" to think that), but at least it is possible. His style is vitriolic at times, but if you look past it, he makes some very valid points. I have no first hand experience with git, so I don't know if his solutions work well enough for me or not, so I have to take his "solutions" part of the presentation with a grain of salt. But I don't remember any problems with the "problem identification" bits.
- doug
Subversion does not do merge tracking. They're hoping to add it in svn 1.5, but....
Can you explain what in particular he mentioned that you can't do with svn? I've used it a few times, including a few merges, and haven't really run into anything he described that I can't do. I'm interested in knowing more about its limitations before I run into them. A link would do, if you'd like. Thanks. :)
dude, wc -l counts blank lines and comments
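A rough sketch of the difference, assuming C-style `//` whole-line comments only (block comments and trailing comments are not handled, and the function name is made up for illustration):

```python
def count_code_lines(text, comment_prefixes=("//",)):
    """Count lines that are neither blank nor whole-line comments."""
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith(comment_prefixes):
            count += 1
    return count


sample = (
    "// header comment\n"
    "\n"
    "int main(void) {\n"
    "    return 0; // trailing comment still counts as code\n"
    "}\n"
)
print(len(sample.splitlines()))  # 5, the figure wc -l would report
print(count_code_lines(sample))  # 3
```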
I found the information I needed.
I'm using ClearCase at my present employer, and your complaints about CVS here (which I've also used in the past) sound exactly like my complaints about ClearCase. Except that, in my experience, CVS was much easier and faster to use than ClearCase. Why anyone would pay the huge license fees necessary for ClearCase when tools just as good (or just as bad, as it may be) are available for free is beyond me.
Personally, I like Subversion the best, since it addresses most of the complaints you have here (esp. atomic commits and changelists).
I haven't used Perforce, so I can't comment on that.
Actually, no, and for the same reasons file locking is an abysmally bad idea.
If you have to make a core API change, it's going to suck. Period. But if you lock the API that's used through the entire code (think resource allocation, for an example) that means nobody can work on anything, and all existing outstanding work has to be merged and stopped before you can even start to make the change. Refactoring is difficult enough as it is, you can't make the entire project grind to a halt over it.
We host lots of applications (over 50) on the system we have, with folks all over the world using it. speed has never been an issue for me at all.
it's a great system when you have a good admin. I can't pimp it enough.
PHP is the solution of choice for relaying mysql errors to web users.
We just moved to perforce. From alien brain.
You want to test a code management system? Try working at a games company. We're not just talking programmers (or code) here. We're talking versioning artistic assets, the works. Many people who use the system are not programmers (programmers are a small part of any video game .. tons of modelers, animators, texture artists, designers, illustrators.)
Combine that with the fact that you have to produce stable builds 60 days after you start the project, one a month, for a few years .. compiled for multiple hardware (ps2/gc/wii/xbox .. any combination, depending on the game) and to me, its a crazy test on code management software.
Alien brain was maddening as hell when it came to anything more than checkouts and checkins. Perforce has a really nice command line interface, and a decent (but not as good as alien brain) gui.
The thing I will miss from Alien brain is that you could create a custom layout locally of the gui, much like Visual Studio (although clunkier.) The thing I'm looking forward to in perforce is atomic change sets and from what I understand so far from my limited use of it, decidedly better branching. I bet dollars to donuts that our artists like Alien Brain more, but p4 seems far more capable of the tools you need to keep the build stable.
"Old man yells at systemd"
... and the non-networked medical instruments had a functioning development suite on them? Bzzzt. You develop on a networked PC and load the software/firmware to the instruments via whatever means. If the medical instruments have a PC controlling them, you throw a goddamned nic card in your development box.
Or are technicians supposed to say "whoops, I can fix that glitch!" as they're examining someone, drop out of the exam, fire up the IDE, fix the bug, re-start the program and complete examining the patient?
Yeah, didn't think so.
Yes, in some cases we (not me, I was just a contractor) loaded development tools, including a full IDE, onto the medical devices for development. The devices that go out to the customer do not have these tools loaded, of course.
Tired of free ipod spam sigs? Opt ou
If you have to talk to a server over slow links, decentralized is much better
I have recently used SVN and Microsoft TFS, and in this respect SVN is a clear winner since it keeps version history locally, and you only need to connect when getting or committing updates. I've done work on a laptop on a train with no network at all. SVN didn't bat an eyelid. TFS, by contrast, throws a hissy fit if it can't get to the version server.
My Karma: ran over your Dogma
StrawberryFrog
Of course, in theory, there's no meaningful difference between theory and practice, but in practice, there is.
Well sure, it seems chaotic and different because you're not used to the idea. Assuming you're working with just 1-5 other people, it's a fairly simple cognitive load. Heck, you'd probably even script it, so it'd just happen.
I submit to you, the reader, that the subversion method is pretty chaotic too. Because of the "Thunderdome" style of launching all patches into trunk without regard for if the build works, it can be really unclear if your checkout works, has passing tests, or any other thing. All you can do is hope the logs are accurate. And this assumes you have the patience to wait for SVN to tell you these things, given how slow it can be.
To me, that's one of the worst case scenarios. Because responsibilities are often delegated in this kind of situation, you seldom have any idea about the code that's being worked on "over there." So if it breaks in a reasonably-sized project, you're somewhat screwed.
There is almost exactly the same amount of integration work. The difference is that you can defer it, or foist it off on other people. These other people may find a merge that baffles you to be utterly trivial.
Actually, it works fine. What you do naturally with such a large group is that you begin to delegate. You say, "It's Alice's job to get everyone's patches for this component, and she, Bob and Carlyle will be working on that part." Their patches feed up through her, and then you pull from Alice, confident that she's dealing with that part of the software.
Obviously git scales to the large-delegated-group solution, that's where it's being used to greatest effect (i.e., the linux kernel).
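A back-of-the-envelope count for the "what if I have 20 co-workers?" worry, under stated assumptions: in the everyone-pulls-from-everyone case each developer pulls from every other developer once per round; in the delegated case each team feeds one lieutenant (who is one of the developers), a separate integrator pulls from the lieutenants, and everyone then pulls the integrated tree once. The team size of 5 and both function names are made up for illustration.

```python
import math


def mesh_pulls(developers):
    """Everyone pulls from everyone else: n * (n - 1) pulls per round."""
    return developers * (developers - 1)


def delegated_pulls(developers, team_size):
    """Pulls per round when work flows through lieutenants to one integrator."""
    teams = math.ceil(developers / team_size)
    # Team members -> their lieutenant, then lieutenants -> integrator.
    upward = (developers - teams) + teams
    # Every developer then pulls the integrated tree once.
    downward = developers
    return upward + downward


for n in (6, 20):
    print(n, mesh_pulls(n), delegated_pulls(n, team_size=5))
# 6 30 12
# 20 380 40
```

Under these assumptions the delegated count grows linearly with head count while the all-pairs count grows quadratically, which is the scaling argument made above.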
Slashdot. It's Not For Common Sense
- distributed (hundreds or even thousands developers working same time, multiple dev teams working on different modules etc)
- reliable and secure (I'm releasing versions quite often and those releases have to work)
- fast (I have better things to do than wait)
what alternatives do I have?