Interview with Tom Lord of Arch Revision System
comforteagle writes "Every revision control system has its supporters and detractors, but none is as polar as Arch. Either you hate it or think it is the best thing in revision control ever. Built more around what our beloved kernel hackers use (BK), Arch is definitely a departure from CVS and Subversion. I've interviewed Tom Lord, Arch's daddy, about the application, and he has some -ahem- interesting answers and opinions."
They forget those of us who have never heard of it before.
:(
And those of us who have heard of it, but have no idea if it's a good thing or not.
I noticed freedesktop.org has started using it to some degree. But like I say, I have no idea if that's a good thing. It is slightly inconvenient in that I have to go read yet more docs to use it.
Well, he slams the Subversion design pretty good. I don't know enough about the design of either Arch or Subversion to comment on either - maybe someone else can, but Subversion seems to be gaining quite a following from what I've seen.
Look at the way the Linux kernel project works, at least for developers who are willing to drink the koolaid of Bit Keeper (BK) licensing.
I guess that's a different koolaid than what the Stallman/Gnu cult members are drinking.
GNU arch was awarded an Open Source Award last quarter.
As ever, people, OSI is accepting nominations for OSAs.
John.
I think the most polar source control system is Rational's ClearCase. You really love it or really hate it. It's a very complex software package, but very powerful.
Personally, I really like ClearCase. Too bad it's so expensive, otherwise I'd use it for all my open source work.
Don't we get enough marketing droids that can't ever say what they mean? I agree he was upfront, blunt, and brutal, but in the end he didn't seem crazy or wild or unreasonable. He even backed up some of his more inflammatory statements. I think he was a very good interviewee. He did seem a little too forgiving of his own project's weaknesses, but that is not unexpected and relatively forgivable.
Your CPU is not doing anything else, at least do something.
The guy tends to use strong words when describing the flaws of cvs/svn. However, he gives no details. There seems to be a lot of talk with little information.
Even if this arch thing is good, i am not going to switch for two reasons: i am happy with cvs, being aware of its drawbacks, switching to a better system is not critical; i am certainly not impressed by what its author says.
Perhaps i should have given them the other way around.
Tom Lord has tried to work more closely with other revision control packages before (including the Subversion team) but he has been hampered by his complete and total lack of people skills. I don't think he tries to, but he ends up offending everyone he tries to have a "discussion" with. It's comical and sad at the same time.
> As to svn backends... I think it is prudent to
> point out a false statement made by Lord.
> [Hey, FSFS exists.]
I agree it is good to point out FSFS. The interview is, indeed, misleading in that respect.
As far as I know, back when the interview was conducted, FSFS did not exist or at least was not on many radars.
A separate question is whether or not FSFS really makes the server-side of svn all nice now or not --- but certainly that is not going to be worked out in /. comments.
-t
Ok, I admit I just want to get darcs mentioned here, but I really want to know what Tom (as well as Larry McVoy) thinks about darcs. In particular, whether the theory will stand up to real use and scale to large projects. I have a hunch that David Roundy has discovered much of what Larry McVoy said was a dozen PhD theses worth of research behind BitKeeper.
The evaluation of an action as 'practical' . . . depends on what it is that one wishes to practice.
I dunno, the best software I've seen has come out of derision of bad software. I don't think the creator of Postfix loved sendmail too much. Many people dislike BIND and have come out with arguably better alternatives.
The other extreme is just developers who hate the popular software just for the sake of hating popularity. That seems to be the case with DSpam over Spamassassin. I don't think that's the case here however. While CVS is reliable software and people know how to work around its flaws (and the creator of arch fully admits that) it is at the same time fairly flawed.
I'd tend to agree that CVS is klunky in the way he describes. I still use it of course since it gets the job done. I've not tried subversion at all, so I can't comment on how well that fixes the problems of CVS.
AccountKiller
Subversion 1.1 has support for a normal filesystem backend instead of BerkeleyDB. See the release notes.
It's now possible to create repositories that don't use a BerkeleyDB database. Instead, these new repositories store data in the ordinary filesystem.
If someone can tell me of a tool that can handle filesystem ownerships and permissions (in the context of Linux, in my case) and version them, I would like to hear about it.
At the moment I am using subversion because it has versioned properties and I wrote a bunch of scripts to extract filesystem metadata and create svn properties from them and vice versa.
We have at least one arch fanatic where I work and when I asked him about this, he seemed to think that using arch for what I want would be *fantastic* and arch would rule, only I'd have to use the cvs method of maintaining ownerships and permissions, ie a script which maintains them in a file which is in the repository. Which I tried and which sucks.
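For anyone wondering what the "script which maintains them in a file" approach looks like, here is a minimal sketch in Python. It is not the poster's actual scripts and not anything shipped with arch, CVS or Subversion; the function names (`capture_metadata`, `restore_modes`, `save_manifest`) and the JSON manifest layout are made up for illustration. It records mode, uid and gid per path and can put the permission bits back; restoring ownership is left out because chown normally needs root.

```python
import json
import os
import stat


def capture_metadata(root):
    """Walk root and record mode, uid and gid for every entry, keyed by relative path."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            info = os.lstat(full)  # lstat describes a symlink itself, not its target
            manifest[os.path.relpath(full, root)] = {
                "mode": stat.S_IMODE(info.st_mode),
                "uid": info.st_uid,
                "gid": info.st_gid,
            }
    return manifest


def restore_modes(root, manifest):
    """Reapply the recorded permission bits; ownership is deliberately left alone."""
    for rel, meta in manifest.items():
        full = os.path.join(root, rel)
        if os.path.islink(full):
            continue  # chmod would follow the link and change the target instead
        if os.path.exists(full):
            os.chmod(full, meta["mode"])


def save_manifest(manifest, path):
    """Write the manifest as sorted JSON so that diffs between revisions stay readable."""
    with open(path, "w") as handle:
        json.dump(manifest, handle, indent=2, sort_keys=True)
```

The obvious weakness, and probably part of why the poster says this approach "sucks", is that the manifest is only as current as the last time someone remembered to regenerate it before committing.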
In the free world the media isn't government run; the government is media run.
This isn't as much "normalization" as it is "don't take so many drugs when you're designing tables."
I use Subversion at work with a large (half-gig) source tree, and primarily on Windows (with TortoiseSVN). We use the new FSFS backend. Seems to do quite well, even on a networked filesystem.
Those who do not know the past are doomed to reimplement it, poorly.
A quote from an email conversation with an unnamed Arch user in January: "I think Arch's biggest bug is the one up the developer's collective asses."
This article is a good example. Tom Lord just hand-waves his way past every question. Subversion sucks!!! CVS users are teh stupid!!! If he tones it down a bit, he definitely has a future in politics. But I don't think he's a very good software architect.
OK, it's true that CVS and Subversion have problems. But, gak, so does Arch. Good God is it slow for big projects (something they've been promising to fix for years). And it's got some horrifying naming conventions: "tla--devo--1.3". And the files! "{arch}", "++default-version", ",,inode-sigs". Whatever Lord was smoking, it must have been good. The branching and merging operators are powerful but, thanks to all the punctuation, they are also ugly. It's like the entire UI goes out of its way to be downright unfriendly.
Every time someone mentions these deficiencies on the mailing list, they just get flamed for not truly understanding Arch. "Namespaces! Namespaces! Namespaces!" "Win32 is for lusrs!" Whatever. I just want a tool that helps me get the job done.
Personally, I'm in the middle of transitioning to Subversion. It's better than CVS, and it is faster and nicer to use than Arch. Works for me.
What struck me as interesting about his comments is he only admitted to one flaw in Arch and he sort of mumbled it out: "...performance...won't bother most users...yadda, yadda, yadda".
I find it hard to believe that Arch would be so perfect. If he really knew the strength of his software he would also have no problem admitting to its weaknesses and Arch would be that much better for it.
Instead he spent most of the article attacking Subversion. If Arch is really that good, why would he spend so much time complaining and critiquing something else?
The article says that Tom Lord claims that a comprehensible interface for arch should be ready by the end of the year. Arch really is the right design, and will be ideal once there's a sane interface.
I refuse to use a software application where I have to invoke the author's initials to specify commands.
Tom, change your name
you narcissistic f'er
I'd be interested to hear if anyone has actually gotten happy with distributed development under arch. I tried a reasonably simple case a few weeks ago, and couldn't get it to feel right.
What I was trying to do was to have a two-layer revision control system, where I have a private archive in addition to the project archive, and I check into the private one all the time, and transfer changesets to the project archive when I'm happy with it. That way, I can be halfway through refactoring a big chunk of code, have it completely broken, but have the work so far revision controlled so that, if I accidentally wipe out my build tree, I can recover it.
The problem I ran into was that I couldn't get the two archives to agree exactly on the current status: whenever I transferred my changes up from the private archive, it added a log message to the project archive, and my private archive wasn't up to date, because it didn't have the message. When I updated my private archive from the project archive (either to pick up the message or to get other people's changes), I had to put in a log message, which the project archive then didn't have.
It seems like arch really ought to support getting two archives in perfect sync, as well as disregarding a commit to a remote archive that only adds changesets already in the local archive (as well as disregarding the changesets themselves, which it does do).
"I think CVS is the best of a pretty poor bunch at the moment - it may not be flashy, but it works. Subversion looks nice, and is mostly a better cvs, but it seemed to be a touch flaky with large (>1Gb) trees when I last tried it (getting itself into a corrupt state). It also used to let you check in files with bad filenames and then protest when you tried to check out. And lots of little things are essentially undocumented so you're forced to rely on the mailing list too much. I'm not thrilled about aspects of the design either."
CVS is, quite frankly, ass! On tagging it can _seem_ like it's tagging successfully (T 'filename'), and it even gives you a handy exit code of 0. But then when you go to actually use your tag you may get some, all, or none of the revisions of the files the tag was supposed to be applied to. It's not atomic in anything it does. In Subversion, operations succeed or they don't. Not some sort of throw the dice and see what actually got checked in/tagged/branched/whatever and what didn't.
You get better log output, super fast execution, and much much better branching.
The comment from Lord "numb-nuts" of Arch about svn being a toy is asinine. Bdb isn't the worst thing there is. And there's work to provide a choice. There's work progressing on using *sql as the backend storage. There's also work to give one the option of using a plain filesystem like CVS. If you're sick and like that kind of thing.
In every way imaginable Subversion is superior to CVS. I have gone through the hell of having to work around CVS' failings. I have also experienced how much better life is with Subversion.
Our repository is just 1.2GB right now. I've not experienced any "flaky" behavior whatsoever. It is, hands down, the better scm tool.
If all code, binaries, and everyone involved in the creation and/or continued propagation of CVS were to be de-res'ed, the world would be a better place.
For an opensource scm tool subversion is the way, the truth, and the light.
If you want to talk about the best scm tool, bar none, that would be ClearCase. Truly a best in class application. Although it wouldn't hurt them to step into the latter half of the '90s and get rid of Motif as their widget set for the gui frontends. But that's only if you care about the gui, right?
Do any of these systems have good support for automatic conflict-resolution? While we don't run into conflicts often, their annoyance is compounded by the obviousness of their resolution (that is, yes, it's easy for us to fix, but why should we have to?) We're still using CVS (oh, stop laughing already) ... does anything else have support for (and preferably already-implemented) rules to auto-resolve conflicts?
Arch really does need a 'simple' mode for new people. It certainly took me longer than 20 minutes to get going well, and then a lot longer before I really got good at it.
The thing is, being really good at arch is more productive than being really good at svn.
I think arch supports a much better model for opensource development than svn, because it is a distributed model. So while *I* might have the official release of a project, if someone else wants to download and hack on it, they get to keep their changes in a revision control system, and I can easily merge their changes back. And if they keep developing, I can keep updating without them having to worry about what patches I accept and what I reject.
It also supports maintaining multiple development branches much better. (You have a --dev tree, and a --release tree, where each one is evolving, hopefully one faster than the other.) With CVS, you pretty much only have a branch to eventually merge it back to HEAD. My understanding is SVN is a little bit better about it, but they still don't natively support doing more than 1 merge between 2 trees and automatically detect what has been merged in the past.
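To make "automatically detect what has been merged in the past" concrete, here is a toy model in Python. It is only a sketch of the general idea of remembering which patches a tree already holds; it is not how arch (or any real tool) stores its patch logs, and the `Branch` class and patch names are invented for the example.

```python
class Branch:
    """Toy branch: an ordered list of patch names plus a record of what it already holds."""

    def __init__(self, name):
        self.name = name
        self.patches = []      # patches in the order they were applied
        self.applied = set()   # the same patches, for quick "already merged?" checks

    def commit(self, patch_id):
        """Record a patch once; repeating a known patch is a no-op."""
        if patch_id not in self.applied:
            self.patches.append(patch_id)
            self.applied.add(patch_id)

    def merge_from(self, other):
        """Apply only the patches from `other` that this branch has not seen; return them."""
        missing = [p for p in other.patches if p not in self.applied]
        for patch_id in missing:
            self.commit(patch_id)
        return missing


dev = Branch("dev")
release = Branch("release")
dev.commit("patch-1")
dev.commit("patch-2")
first = release.merge_from(dev)    # picks up patch-1 and patch-2
dev.commit("patch-3")
second = release.merge_from(dev)   # picks up only patch-3
```

Because the release branch remembers what it has taken, the second merge does not try to reapply the first two patches. Without that record, you have to track by hand which ranges were already merged.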
I got interested in that project a while back, as I felt that daily version control was really holding back my project development. Unfortunately at the time arch was not Win32 ready (is it now?) and my Windows mindset coworkers already categorized CVS as "piece of shit" compared to MS Visual Sourcesafe (so there was no place to start that debate again). I think that the quest for a revision control system that is intuitive (for all user levels), revolutionary (merging of forked projects or selectively applying patches is difficult), Free (as in freedom, but also as in corporate PHB show me the money), stable (is it a one-man egotrip or really value-added? what's the story with the restraining order? (I'll burn for that, especially today)) and popular (is there a place for new players in the version control market? Was that interview and the slashdot followup really such good PR?) is still on.
IMO, svn's use of berkeley DB as its backend, an opaque, non-human-readable, non-human-recoverable, non-machine-portable* database, is its biggest shortcoming...
I still use svn, though. I'm just glad to be able to rename directories.
I'd pee myself if someone forked svn and gave it a more friendly backend.
-Ed
*By this, I mean that you can't take the berkeley DB, copy it to another machine, and expect it to work... the internal byte order is machine specific.
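For readers who have not run into byte-order problems, here is a small Python illustration of the general issue the footnote describes. It uses only the standard `struct` module and says nothing about how Berkeley DB actually lays out its files; it just shows that the same integer serializes to different bytes depending on byte order, and that reading with the wrong assumption gives a wrong value.

```python
import struct
import sys

value = 1

little = struct.pack("<I", value)   # explicit little-endian, 4 bytes
big = struct.pack(">I", value)      # explicit big-endian, 4 bytes
native = struct.pack("=I", value)   # whatever order this host uses

# Bytes written little-endian but read as big-endian decode to a different number.
misread = struct.unpack(">I", little)[0]   # 16777216 instead of 1

print(sys.byteorder, native.hex(), little.hex(), big.hex(), misread)
```

A format that always writes one explicit order (the `<` or `>` forms) can be copied between machines; one that writes host order cannot be read safely on a machine with the opposite order unless the reader converts.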
If you wanted to use Arch, but it's too complex, then you should try darcs. It has fully-distributed operation, but you can get up and running in much less time. Commands have a closer resemblance to what you're used to in Subversion or CVS: "darcs record", "darcs revert", "darcs diff", etc.
The best thing about darcs is that every operation is local by default. Subversion does diffs locally; darcs does everything locally. You only need to wait on the network when you want to get something not on your machine, or when you want to share your work with others. Arch can be made to work this way, but it requires a bit of setup and a lot of understanding of advanced concepts: mirrored archives, revision libraries, etc. With darcs, fast is the default.
The main downside is that it's still pre-1.0, and so a bit less stable and documented than Subversion, though still reasonably good.
I have a huge amount of respect for him. He taught me that compromise is way overvalued.
Huh? Did you read the same mails as I? Back then, Tom Lord's ramblings on the svn-dev mailing list had the same problem as this interview. And also those the grandparent complained about:
What exactly is bad about Subversion? Give me an example scenario that shows me just how fucked I would be with svn and how Arch would ride in on a white horse and save the day.
TL talked big about how Subversion's design was broken, but when asked to give concrete examples he always kept talking about theories.
IMHO, it's not much unlike saying that Linux sucks because it isn't a micro-kernel architecture. And when being asked about details, being unable or unwilling to come up with an example how a micro-kernel design would fix an existing major flaw (without sacrificing the existing good points of the software).
For example, I like QNX's design very much. But that doesn't imply that Linux is broken or sucks. Both have their strong and their weak points depending on the task at hand. (And for my daily desktop work I would fall into a crisis if I had to use QNX instead of Mandrake due to some QNX usability issues... oh wait, that reminds me of arch!)
Keep an eye on which arguments are silently dropped in replies. Not always, but often times it's very telling.
I've read a bit about CVS/Arch and their possibilities. It seems that CVS, for example, is very widely used. What do you think one should use for an '/etc' versioning system? Since I'm still new to this I might as well learn the most flexible tool, I guess.
This is one of two big things I miss from VMS/TOPS-10. File versioning was very valuable. The ";n" filename versioning worked surprisingly well considering it was such a simple implementation. (For the uninitiated, VMS automatically maintained the most recent version without the ";n".) I wish *nix had this.
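You can fake something like the ";n" scheme in user space. The Python sketch below only imitates the behavior described above (numbered copies plus an unnumbered latest copy); it is not how VMS implements versions, and the helper names `next_version_path` and `save_versioned` are made up for the example.

```python
import os
import re


def next_version_path(path):
    """Return 'path;N' where N is one more than the highest existing version of path."""
    directory = os.path.dirname(path) or "."
    base = os.path.basename(path)
    pattern = re.compile(re.escape(base) + r";(\d+)$")
    highest = 0
    for entry in os.listdir(directory):
        match = pattern.match(entry)
        if match:
            highest = max(highest, int(match.group(1)))
    return os.path.join(directory, "%s;%d" % (base, highest + 1))


def save_versioned(path, text):
    """Write text as a new numbered version and refresh the unnumbered latest copy."""
    versioned = next_version_path(path)
    for target in (versioned, path):
        with open(target, "w") as handle:
            handle.write(text)
    return versioned
```

Every program that writes files would have to go through a wrapper like this, which is exactly why having it in the filesystem, as on VMS, was so much nicer.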
TOPS-10 (not sure about VMS) also had project as well as programmer permissions - kinda like groups but more powerful and useful. Once logged in as a user, you could change projects. Your login would look like, e.g., "user[alex:kerneldev]". Thus files and directories were owned by a project as well as a user, and the system maintained accounting data for both. It was easy to allocate and track work time and resource utilization to projects.
The third big thing I'd really like to have is the transcripting facility in the Perq workstation's text editor. (Perq was an ancient workstation - I have three, will consider selling them as I need the $$.) The editor maintained a transcript of all changes made to the file and stored them on disk. In the event of a crash this transcript could be replayed while you watched. Besides being interesting to watch your own work in fast-time, it allowed recovery from the beginning up through the last block saved. VIM has a short transcript/replay, but it's cumbersome to use for anything more than a few keystrokes. It also has a basic recovery capability but doesn't work as well as this. I dunno about Emacs these days. I once restored a marathon 36-hour programming session (deadlines breed insanity!) using replay. The ideal would be a kind of 'tape' feature in the editor, which one could fast-forward and rewind by using a GUI, and grab that part where you wrote a nifty bit of code (or text), but then backtracked and went a different direction, and now you need that nifty bit.
It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/