Does that include running a script after a check-in?
Absolutely. A hook script runs when a person first begins to do a commit. That is handy for user-level authentication steps (beyond what Apache itself provides). A second script runs after the entire commit has been delivered to the server, but before it is truly committed to the repository. This allows you to analyze what changed and accept/reject the commit, or even to make additional changes. After the commit has been accepted, then a third hook is run which allows you to do whatever you need (emails, replication, analysis, indexing,...).
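To make the middle hook a bit more concrete, here is a rough sketch of the kind of accept/reject check it could run. Everything in it is an assumption for illustration: the "tags are read-only" policy, the function names, and the idea that the hook is handed lines in the style of `svnlook changed` output (a status code in the first columns, then the path). It is not how Subversion itself invokes hooks, just the decision logic.

```python
# Hypothetical policy: existing content under tags/ may not be modified or deleted.
PROTECTED_PREFIX = "tags/"


def parse_changed(lines):
    """Parse 'svnlook changed'-style lines ("U   trunk/foo.c") into (action, path) pairs."""
    changes = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        # Assumed layout: status flags in the first four columns, path after that.
        action, path = line[:4].strip(), line[4:].strip()
        changes.append((action, path))
    return changes


def check_commit(changes):
    """Return a list of violations; an empty list means the commit is accepted."""
    violations = []
    for action, path in changes:
        # Additions (such as creating a new tag by copy) are allowed;
        # any other action on a path under tags/ is rejected.
        if path.startswith(PROTECTED_PREFIX) and action != "A":
            violations.append(f"{action} {path}: tags are read-only")
    return violations


sample = ["U   trunk/foo.c", "A   tags/release-1.0/", "D   tags/release-0.9/README"]
for problem in check_commit(parse_changed(sample)):
    print("reject:", problem)
```

A real hook would turn a non-empty violation list into a non-zero exit status so the server refuses the commit.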
What about tagging revisions?
Subversion doesn't support tags. Instead, we support cheap copies -- just copy the bits to, say, /tags/release-1.0/. That is effectively an O(1) operation and doesn't consume any real disk space. However, this does imply another commit to create the new tag subdirectory.
For your scenario, I suspect what you would want to use are "revision properties." Each revision that gets committed can have (arbitrary) properties associated with it (and the properties are not versioned). Thus, your post-commit script could kick off your build farm. As each build completes, you can add a new property to the revision. When all are complete, then you kick off the regression tests. Upon a pass, you add a final property (and possibly remove the build props).
What we don't have at this time, and what isn't on the plan, is a way to update to a revision that matches a particular query. However, a CGI script could easily be written which would query this on your SVN server and return the specified revision. You could then update your working copy to that revision. Heck, you could write a client-side script that would connect to the server to do the query, then pass the result to your 'svn update' line.
So I'd say the data model is there, but the wrappers you're looking for are not built in. But due to SVN's great scriptability and library-based design, you can easily grab and/or modify the data you need.
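As a sketch of what such a wrapper might look like on the client side: the property names (`build:<platform>`, `tests:passed`) are invented for this example, and the revision data is a plain dictionary standing in for whatever your server-side query would return. The only real command involved is `svn update -r REV`, and the script just builds its argument list without running it.

```python
# Hypothetical property names; Subversion does not define these.
BUILD_PREFIX = "build:"
TESTS_PASSED = "tests:passed"


def all_builds_done(revprops, platforms):
    """True once every platform has recorded a 'build:<platform>' property."""
    return all(BUILD_PREFIX + p in revprops for p in platforms)


def latest_passing(revisions):
    """Return the highest revision carrying the tests-passed property, or None."""
    passing = [rev for rev, props in revisions.items() if TESTS_PASSED in props]
    return max(passing) if passing else None


def update_command(rev, path="."):
    """Build the argument list for updating a working copy to a given revision."""
    return ["svn", "update", "-r", str(rev), path]


# Stand-in for the answer a server-side query script might give.
revisions = {
    101: {"build:linux": "ok", "build:win32": "ok", "tests:passed": "yes"},
    102: {"build:linux": "ok"},
}
rev = latest_passing(revisions)
if rev is not None:
    print(update_command(rev))
```

Here revision 102 is still waiting on a build, so the wrapper would pick 101 as the newest revision that passed regression.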
Have no fear. Yes, you can use it on your local filesystem (as a non-root user), without a server running.
We say "natively client/server" simply because it was designed to operate over a network. CVS had network support bolted on.
Since Subversion is effectively a WebDAV server, you can mount a Subversion repository on your Windows (Web Folders), MacOS X (native), or Linux box (davfs). Then you can feel free to use your normal tools on it.
(for the moment, it is readonly when mounted like this, but SVN 2.0 will enable read/write (unless some enterprising code jockey does the week of work necessary))
The Subversion server is built on top of Apache, which runs "everywhere". Our only real limiting factor is where Berkeley DB runs, but that seems to be everywhere, too.
In fact, Apache 2.0 on Windows is quite the neat little beast. It keeps up with IIS(!) Not bad for an Open Source project which doesn't get to put its code into the kernel [like IIS does].
And Apache and Subversion are native Windows apps. No need for cygwin or vmware.
Note that Subversion is designed as a set of libraries. The command-line client is just a thin app that connects those libraries to the command line. A GUI app can use the same libraries to present a GUI. Via the scripting language bindings, you can use those APIs to do whatever.
Note that this doesn't normally "pollute" the GUI in Subversion. We have a much more flexible structure and access into the repository. Most of the time, people check out /trunk/. All of the branches that people create are over in /branches/. Thus, most users will never see (all) the branches unless they specifically go looking for them.
Oh, I knew SourceSafe, for better or worse :-)
Yes, we are similar to SourceSafe, in that we make cheap copies to branch the code.
However, the original writer is missing a *very* important point. CVS branches are not simple and cheap. To branch, every ,v file in the repository must be entirely rewritten to insert a couple tags at the front of the file. If you're branching a repository with a few gigabytes of source, then this is a hugely expensive operation.
Subversion can branch a 10 gig repository in less than a second :-)
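If you're wondering how a copy can be that cheap, here is a toy model of the idea. It is not Subversion's actual filesystem code; the `Dir` class and `cheap_copy` function are made up for illustration. The point is only that a "copy" can be a new directory entry pointing at the existing tree, so the cost depends on the directories rewritten along the path and not on how much data sits under the thing being copied.

```python
class Dir:
    """An immutable directory node: a mapping of names to child nodes or file contents."""

    def __init__(self, entries=None):
        self._entries = dict(entries or {})

    def get(self, name):
        return self._entries[name]

    def with_entry(self, name, node):
        """Return a new directory that differs from this one in a single entry."""
        entries = dict(self._entries)
        entries[name] = node
        return Dir(entries)


def cheap_copy(root, src, dst_parent, dst_name):
    """Copy top-level directory src to dst_parent/dst_name by sharing the source node."""
    source = root.get(src)
    parent = root.get(dst_parent)
    # Only the parent and the root get new nodes; the source tree is shared as-is.
    return root.with_entry(dst_parent, parent.with_entry(dst_name, source))


trunk = Dir({"main.c": "int main(void) { return 0; }"})
root = Dir({"trunk": trunk, "branches": Dir()})
new_root = cheap_copy(root, "trunk", "branches", "feature-x")
print(new_root.get("branches").get("feature-x") is trunk)
```

The old root is left untouched, which is also why earlier revisions stay readable after the branch is made.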
Subversion handles most of those items. It does not (yet) come with a graphical merge tool (#4), although writing a client should be easy, and that client can easily integrate one. For a web view (#6), we have very basic viewing of HEAD via your web browser built in, but will be relying on an external ViewCVS-like tool for complex browsing. Also, we do not yet have #11, but it is easily supported by the architecture. All the rest we have.
Subversion works great in a disconnected environment. While offline, you can get status, and do diffs, adds, and deletes (cvs needs the server for all of these). However, SVN does not have multiple repositories, so you cannot do offline commits or checkouts/updates.
Umm... hello!?!
"catch up" is implemented by "svn update" or "cvs update". It merges others' changes into your local working copy. Then you have "svn commit" or "cvs commit", and both systems ensure that nobody else has changed stuff between the update and commit.
Subversion is designed as a set of libraries. There should be no problem integrating Subversion directly into an IDE (none of CVS's "fork a process" crap and hope we can parse the output).
In addition, we have Python bindings already (via SWIG) and will have bindings for all other interesting languages when we release 1.0 (Java, Perl, TCL, Ruby, Guile, etc), also through SWIG. That should simplify integration into things like Eclipse or NetBeans.
Please feel free to examine the CVS source code. After a few months of trying to understand it, then you'll feel comfortable to make a change. Five things will break. You'll fix two and create another problem somewhere else. Scratch your head, and fix that somewhere else, and two more problems will appear. Now you have six problems introduced, and your feature that you added still doesn't work.
Really. Try it.
Yes, but "over SSH" means that you have to pass out system accounts to everybody who needs to commit to the repository. This specific problem with CVS over SSH blew up SourceForge, and they had to switch over to a PAM/LDAP system to handle the account load.
Subversion doesn't require system accounts. Its authentication is handled by Apache, which means it can tie into any sort of account database: flat text files, LDAP, MySQL databases, Kerberos, NTLM, or whatever. With SSL for encryption and client/server certificate verification, you've got security that meets or beats SSH-based access.
Subversion has a CVS converter (partly done; it converts the main trunk of a repository, but no branches or tags). I would be very interested in throwing it at your CVS repository to see if it works. The RCS parsing is performed using ViewCVS's tools, so if the latter works on your repository (specifically, the annotate/blame functionality), then cvs2svn ought to work.
We've also been doing some testing with portions of the GNU tool chain CVS repository. That is definitely an old repos :-)
Before we declare cvs2svn "done", we'll be throwing the entire GNU repository and the ASF repository at the thing. When it passes, then we'll consider ourselves done :-)
Yup. Linus pretty much started with a version control system that he wants (Larry designed it with an eye towards Linus), and with further development, it will simply narrow in even more.
Subversion's user model is like CVS, and since Linus doesn't like CVS, I doubt that Linus will ever use Subversion for his work. If he does... great.
However, I do believe that the model that Subversion (and CVS) uses is very applicable to many users. Just look at the number of CVS users. People should choose the tool that works best for them.
I'm just hoping many people choose Subversion :-)
We have yet to do a thorough analysis of Subversion's performance characteristics, but we're quite hopeful. We've built the code on a number of tools that should give us excellent headroom for scaling it up.
First, the repository itself is based on Berkeley DB. Very fast, transacted, capable of hot backups, etc. Unfortunately, we can't really scale this across machines, so (until we get a traditional relational backend) we'll reside on a single box. But given its maturity and its use of shared mem to coordinate multiple threads/processes, I think we're going to be able to make great use of it to scale up and (well) across a multiple CPU system.
Next up the design stack is Apache 2.0. We can scale this through threads and processes in whatever way fits the machine and operating system the best. We've got process preforking, fixed thread pools, or dynamic groups of processes and threads. It works on "all" operating systems out there, taking advantage of the characteristics of that platform (there are custom processing modules for Windows, OS/2, BeOS, and NetWare, plus the standard set for Unix and its variants). Compared to CVS which forks multiple processes each time you connect (2 via pserver, or 1 SSH + 1 cvs via ssh), Apache just has processes/threads waiting for processing.
Last up the stack is caching proxies. Since the network protocol is based on HTTP, and we have made use of WebDAV/DeltaV, we have a firm specification on how to mark resources as permanently cachable. For large sites like SourceForge, where a huge amount of the traffic is anonymous checkouts, having reverse caching proxies "in front of" the Subversion repository will offload a lot of the work of delivering basic content down to the user's machine. Workgroups can also install caching proxies at their network edge (or even within their department) and access remote repositories through them. The first guy in who does an "svn update" will prefill the cache, making updates for the next people operate at LAN-speed. For geographically dispersed teams, this will be a big win.
Please note that Subversion is designed as a set of layered libraries. We are also implementing language bindings to those libraries (via SWIG). At the moment, we have Python bindings to most of our libraries.
This scripting support works from the low-level access to the versioning database, all the way up to the high-level client library. (the command-line client is just a thin app glueing stdin/stdout/stderr and the cmdline to the libsvn_client library)
The command-line client app also ensures that its output is machine parsable.
You want scriptability? Hoo. Subversion has it.
[ I'm just waiting for somebody to take the Python GTK bindings and the bindings to the client library to build a cool GUI app ]
You don't need to run a separate SSH tunnel (or even have an account on the server) to run Subversion securely. Since Subversion uses HTTP, we simply use SSL for our authentication and encryption.
For example:
$ svn co https://mysite.example.com/repos/project/
Nothing hard about that :-)
Nope. The network protocol is entirely different.
That said, we've had a couple people interested in writing a "proxy" to handle CVS requests and map those to the Subversion repository (or to the protocol and a remote repository). I can't see that it could allow full interop, but it should be able to support checkouts and updates (and other readonly operations).
Sorry... SVN is short for Subversion. :-)
(the command line client is "svn" and we use "svn" in conversation)
By conflict, I presume you mean "what happens if you check out, make a change, go to commit and find somebody else has modified a file after your checkout." I can't answer for arch, but Subversion is like CVS: the commit simply fails. You need to update your working copy. If the changes to the conflicting file cause (patch) conflicts with your edits, then you'll need to resolve them before committing.
SVN is a bit nicer than CVS, though, in that the conflicts are stored in a .rej file rather than inline with the code. You have to remove that .rej file to let SVN know that you've handled the conflict before a commit can proceed. (this prevents the cases where people commit conflict markers into CVS).
And yes: in Subversion, our revision numbers are per-commit; files do not get their own numbers. 'svn log' can show you the log messages for all commits (no need for ChangeLogs or cvs2cl.pl any more; well, unless you want to ship a changelog, in which case you do 'svn log > ChangeLog').
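For anybody who wants to see the two ideas side by side (one revision number per commit, and a commit that fails when your copy is stale), here is a small toy model. The `Repository` class and its fields are invented for this sketch and gloss over everything a real server does; it only shows the check-then-number logic described above.

```python
class OutOfDate(Exception):
    """Raised when a commit touches a path that changed after the client's base revision."""


class Repository:
    def __init__(self):
        self.head = 0            # one revision counter for the whole repository
        self.last_changed = {}   # path -> revision in which it last changed
        self.log = {}            # revision -> log message

    def commit(self, base_revs, message):
        """base_revs maps each modified path to the revision the client last saw it at."""
        # Check every path first, so a failing commit changes nothing.
        for path, base in base_revs.items():
            if self.last_changed.get(path, 0) > base:
                raise OutOfDate(path)
        # All checks passed: the whole change set becomes a single new revision.
        self.head += 1
        for path in base_revs:
            self.last_changed[path] = self.head
        self.log[self.head] = message
        return self.head


repo = Repository()
r1 = repo.commit({"trunk/a.c": 0, "trunk/b.c": 0}, "initial import")
r2 = repo.commit({"trunk/a.c": 1}, "alice edits a.c")

stale = None
try:
    # Bob still has a.c from revision 1, but it changed in revision 2.
    repo.commit({"trunk/a.c": 1, "trunk/b.c": 1}, "bob commits without updating")
except OutOfDate as exc:
    stale = str(exc)

# After an update Bob's base for a.c is revision 2, and the commit goes through.
r3 = repo.commit({"trunk/a.c": 2, "trunk/b.c": 1}, "bob commits after updating")
print(r1, r2, stale, r3)
```

Note that the rejected commit does not consume a revision number, and the log is keyed by commit rather than by file.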
Subversion was designed to use Apache 2.0 as the network server. Among the many benefits are optimized network usage, conforming to the platform's best use of threads/processes, logging, monitoring of child processes, effective resource usage and recovery, etc. We fully intend SVN to be "blindingly fast" as you put it. No forks, no thread creation necessary on the server: Apache is ready and waiting for that request to come in :-). Mix that with on-the-wire compression, and diffs in both directions, and things ought to zip along quite nicely.
Regarding backup/replication: since SVN is based on BerkeleyDB, we get its "hot backup" feature. You can do a full backup without stopping readers or writers. Presuming that you are mirroring the Berkeley .log files, you won't lose any information when you restore from backup + mirror. And with atomic commits, the client won't mark the working copy as "committed" until the server says it was. So if it goes down partway through the commit, just do it again after recovery.
(and note that we get all of Berkeley's replication, recovery, backup, blah blah blah tools)
It may be interesting to note that you can do an "svn commit" to check in a change to a .html file and have it immediately appear on your web site. In fact, SVN uses a URL to specify the repository to check out. That URL can be your website. For example:
$ svn checkout http://mysite.example.com/ -d site
$ jed site/index.html
$ svn commit -m "more tweaks" site
Your tweaks are immediately published.
(of course, it sounds like you want a staging server in there, and some kind of workflow, but that can be done and is an exercise for the reader... :-)
Jim,
I hope you were referring to *his* website.
Yours is far from silly... I read the site every day or two. It is a fantastic site for information about the kernel.
(although I wish Myrdall would comment a bit more rather than simply say "updates" all the time... I mean "duh"... tell us what happened :-)
Please keep up the excellent work. I might even suggest that you think about how you can delegate portions to other people. At least to the extent that you can say, "Hey John Doe, could you figure out and implement how I can automate XYZ?" I bet you would get several takers, myself included.
Cheers,
-g
damn. just re-read the release. I misinterpreted the thing on my first reading... Seems like they probably will count on the commercial shipments, rather than all installs. sigh.
They're probably going to need to revise their strategy, though, given Linux's "buy once, install many" strategy. That just isn't seen for the other operating systems.
Regardless: keep watching IDC.
Cheers.