Ease Into Subversion From CVS

Tip by aePrime · 2004-03-07 16:14 · Score: 3, Insightful

Remember, when changing software components, it's a good idea to back up first!

Re:Tip by Anonymous Coward · 2004-03-07 16:53 · Score: 1, Funny

Ahh, it was a joke. Evidently not a very funny one.
Re:Tip by drinkypoo · 2004-03-08 07:25 · Score: 1

Remember, when driving your car out of the garage, it's a good idea to open the door first!
Remember, when you are going to take a leak, it's a good idea to pull your wang out first! (Assuming you're male, of course. Disclaimers are the second order of business around here, right after frist posts.)
Remember, when you are going to spread peanut butter on bread, open the peanut butter jar first!

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"

Re:Summary? by QuantumG · 2004-03-07 16:29 · Score: 0

The article is a summary.

--
How we know is more important than what we know.

Is there demand? by cookiepus · 2004-03-07 16:29 · Score: 4, Interesting

I've read the linked article (really!) and I think Subversion sounds like a good idea. Primarily, I like the fact that everything you can do with CVS you can do with Subversion in the same way as with CVS.

I am really curious how much demand there is for Subversion's new features, however.

Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS.

Also, have there been many problems that required atomic commits? Can someone explain why this is important? I mean, the idea is you'll need to merge one way or another. I can see the point being in that what you commit at any given time will compile (presuming you're committing completed code) but realistically, does anyone not fix their up-to-date checks as soon as they happen?

Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...

To me it sounds like a great product but I am not able to see a compelling reason why most development shops out there who are currently in CVS would rush to switch.

Not a flame btw, just an opinion.

--
Ecce Europa - Web Design for Business

Re:Is there demand? by nosferatu-man · 2004-03-07 16:35 · Score: 5, Informative

We're switching. CVS is crufty, buggy, and slow. That alone is reason enough to switch, but atomic commits and faster and more transparent branching will be, in the long run, a more fundamental win.

'jfb

--
To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
Re:Is there demand? by aurum42 · 2004-03-07 16:40 · Score: 4, Informative

I don't know what your development model is, but branching and tagging are often some of the most frequent (and slowest, in CVS) operations.
Many projects follow the "make branch, fix bug in branch, test branch and then merge" cycle, which makes a lot of sense.

--
"The slave who knows his master's will and does not get ready...will be beaten with many blows." Luke 12:47-48
Re:Is there demand? by dietz · 2004-03-07 16:45 · Score: 5, Interesting

Before reading this, let the record show that I am a subversion fanboy. But I am only a Subversion fanboy because it solved almost all of my complaints about CVS. I am not involved with the project at all.

Do developers out there voice the need to store binaries?

Uh, most projects of any size will have at least a few binary files in their repository... icons, etc. But you could store those in CVS without too many problems.

Also, have there been many problems that required atomic commits? Can someone explain why this is important?

Rolling back changes without atomic commits is a pain in the fucking ass. Have you ever had to do it? You have to track down every file that you changed (somehow... hopefully you can remember), check which version was the version prior to your commit, and get all those versions of files. For example "Okay, I need version 1.7 of foo.c and version 1.8 of barf.c and version 1.13 of foo.h." It's totally annoying.

Plus atomic commits just make it much, much easier to keep track of what changes have gone in. This is my biggest, biggest complaint about CVS. File-level commits just make no sense. There is no time, ever, that I can think of when the ability to commit an entire changeset at once isn't better than committing a single file at a time.
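
To make the rollback complaint concrete, here is a toy Python model of the two bookkeeping styles. It is only a sketch: `PerFileRepo`, `ChangesetRepo`, `previous` and `demo_rollback` are made-up names for illustration and have nothing to do with the real CVS or Subversion internals.

```python
# Toy model only; not how CVS or Subversion actually store data.

class PerFileRepo:
    """CVS-style: every file carries its own independent revision counter."""

    def __init__(self):
        self.history = {}  # path -> list of contents; index 0 is revision 1.1

    def commit(self, path, content):
        self.history.setdefault(path, []).append(content)
        return "1.%d" % len(self.history[path])

    def checkout(self, path, rev):
        return self.history[path][int(rev.split(".")[1]) - 1]


class ChangesetRepo:
    """Subversion-style: one commit snapshots the whole tree under one number."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is the empty tree

    def commit(self, changes):
        tree = dict(self.revisions[-1])
        tree.update(changes)
        self.revisions.append(tree)
        return len(self.revisions) - 1

    def checkout(self, rev):
        return dict(self.revisions[rev])


def previous(rev):
    """Return the per-file revision just before `rev`, e.g. 1.8 -> 1.7."""
    major, minor = rev.split(".")
    return "%s.%d" % (major, int(minor) - 1)


def demo_rollback():
    good = {"foo.c": "foo v1", "barf.c": "barf v1", "foo.h": "header v1"}
    bad = {"foo.c": "foo v2", "foo.h": "header v2"}  # barf.c is untouched

    cvs, svn = PerFileRepo(), ChangesetRepo()
    for path, text in good.items():
        cvs.commit(path, text)
    svn.commit(good)

    # The bad change: in the per-file model you must remember which files
    # it touched and which revision each one ended up at.
    bad_file_revs = {path: cvs.commit(path, text) for path, text in bad.items()}
    bad_rev = svn.commit(bad)

    # Per-file rollback: one lookup per touched file.
    restored_cvs = {path: cvs.checkout(path, previous(rev))
                    for path, rev in bad_file_revs.items()}
    # Changeset rollback: a single number identifies the whole change.
    restored_svn = svn.checkout(bad_rev - 1)
    return restored_cvs, restored_svn
```

In the per-file model the `bad_file_revs` mapping is exactly the thing you have to reconstruct from memory or logs; in the changeset model `bad_rev - 1` is all you need.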

Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...

Depends on your development process. During beta periods, it's common to make a tag or two per day, and if each tag takes ten minutes, well... it's not a big thing, but it's certainly annoying.

To me it sounds like a great product but I am not able to see a compelling reason why most development shops out there who are currently in CVS would rush to switch.

Certainly not every shop is going to "rush to switch". But, regardless, I imagine that every shop will switch eventually. It may take years, but subversion's advantages are significant enough that in my opinion it will become the new version control standard.

Also note that CVS was crufty and adding new features was almost impossible. Subversion targeted CVS features as their 1.0 milestone. But more importantly, the Subversion code base is a much better baseline to work from when adding new features. So you can expect that it will only get better in the future.
Re:Is there demand? by Endive4Ever · 2004-03-07 17:02 · Score: 5, Informative

Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS.

Yes, developers definitely need to store binaries. I worked on a project awhile back where the boot block code was a finished binary. Because CVS was used to house the project, a horrible kludge involving UUENCODE had to be used to store the binary commits. Sometimes the binary was created by a totally different tool that the main build machine doesn't have. In the case I speak of, the binary was built with an expensive licensed assembler for an Analog Devices DSP chip, and contained as a body of the 'build' because it was dynamically 'injected' into the dsp processor from the native processor, which happened to be an 80196.

There are always cases where a binary needs to be committed. Think about bitmaps and other resources. It doesn't make sense to 'generate them from source' every time a build is done.

Given all this, it's my understanding that with newer versions of CVS binaries can be committed safely. Is this even an instance where 'Subversion' is needed?

--
---
Re:Is there demand? by Anonymous Coward · 2004-03-07 17:29 · Score: 2, Informative

Do developers out there voice the need to store binaries?
It's a useful feature. Many companies like to store versions of binaries alongside sources. That way, if some customer has a bug with version 2.1.2.4 of Foofware, the company can just check that out, instead of figuring out (and hoping to get it right) how to build it.
And atomic commits are very useful. I wonder how CVS got so popular without them, but I think it is that people don't have them and didn't know what they were missing.
Subversion seems to provide a lot more of the things I expect from SCM tools.
CVS seems to me to be a layer on top of RCS. Now, I don't use either. I'm in a PhD program, and I use ClearCase LT thanks to the IBM scholar program. Sure, it's heavyweight, but I got used to it at HP and I like it. Feels solid.
Re:Is there demand? by Thing+1 · 2004-03-07 17:57 · Score: 1

Atomic commits are essential if you plan to automatically build upon every submission.
Otherwise, when a developer changes a data structure, and submits the .C before the .H, the build will break if it decides to build after the .C was submitted.
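
A small Python sketch of that failure mode, assuming a pretend `build` that only succeeds when source and header agree. The file names and the three function names are hypothetical, chosen just for this illustration.

```python
def build(tree):
    """Pretend compile: succeeds only if source and header agree on the layout."""
    return tree["widget.c"] == tree["widget.h"]


def commit_file_by_file(tree, changes):
    """Return every repository state a build bot could observe, one per file."""
    states = []
    for path, content in changes.items():
        tree = {**tree, path: content}
        states.append(tree)
    return states


def commit_atomically(tree, changes):
    """Return the single state a build bot could observe."""
    return [{**tree, **changes}]


before = {"widget.c": "layout-1", "widget.h": "layout-1"}
change = {"widget.c": "layout-2", "widget.h": "layout-2"}

# File-by-file: the first observable state has the new .c with the old .h.
print([build(t) for t in commit_file_by_file(before, change)])  # [False, True]
# Atomic: there is no in-between state to trip over.
print([build(t) for t in commit_atomically(before, change)])    # [True]
```

The window is small for one developer, but a build triggered on every submission will eventually land in it.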

--
I feel fantastic, and I'm still alive.
Re:Is there demand? by Ninja+Programmer · 2004-03-07 22:54 · Score: 3, Interesting

Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS.
CVS lets you check in binaries. But it doesn't use any diff algorithm -- it just stores each instance. So it's just inefficient. Any application that uses media will commonly have binary data.

The other thing is that Unicode source data is typically not stored in a purely ASCII compatible form. Moving forward, people are going to be using Unicode source data which at a low level can be considered essentially binary.

Also, have there been many problems that required atomic commits? Can someone explain why this is important?
Once you get to above about two dozen developers working on the same code base, you will end up with erroneous check-in collisions. Detecting and reversing out of these is a lot of fun.

I mean, the idea is you'll need to merge one way or another.
If you check-in multiple files, then everything will be checked in except where there are conflicts. When you fix the "conflicts" you end up with an image that nobody actually tested. If you test it before checking in the fixes for the conflicts, then you leave the source tree exposed in a state where only part of your check in is there (and with enough developers there is an arbitrary number of partial checkins that the tree might be containing at any one time.)

These are all standard "race condition" problems. Commits have to be atomic for the same reason that transactions are atomic in databases, and mutexes/semaphores exist in operating systems.

IMHO, this issue alone is more important than all the others combined.

Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...
Chicken and egg? If tagging were fast, wouldn't people be more likely to use it? Tagging is a way test people, release people, and even marketing people interact with the development results in a way that makes sense to them. Tagging is a very useful thing. Having numbered check-ins like Perforce makes this slightly less important, but why map your milestone ordinals to some homebrew scheme, when your source control can do it for you?
Re:Is there demand? by Jacek+Poplawski · 2004-03-07 23:22 · Score: 2, Interesting

CVS lets you check in binaries. But it doesn't use any diff algorithm -- it just stores each instance. So it's just inefficient. Any application that uses media will commonly have binary data.

CVS stores binaries but it is not so trivial. When we put some binary data into our CVS tree we realized Windows users can't access it (need some setting in repository). CVS behaves differently in Linux and in Windows in this case.
Re:Is there demand? by spongman · 2004-03-07 23:28 · Score: 2, Interesting

Yeah, I love the fact that there's a revision number that's global to the whole repository.
We embed that number into each build of our product and our testers file bugs against a particular revision. If I can't repro a bug against my current code, I can just create a new branch at the given revision, compile, and I know I'm using exactly the same code that the tester was running.
Re:Is there demand? by 0x0d0a · 2004-03-08 02:16 · Score: 2, Interesting

That's because you checked in the binary in text format instead of binary, and the linefeed translation chewed up your binaries when switching between platforms.
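
For anyone who hasn't been bitten by this, a quick Python sketch of why line-ending translation wrecks binary data. The two helpers are a simplified stand-in for text-mode conversion (an assumption for illustration, not CVS's actual code); the eight-byte PNG signature is real and was designed to catch exactly this kind of damage.

```python
def to_windows(data: bytes) -> bytes:
    """Simplified text-mode checkout on Windows: every LF becomes CRLF."""
    return data.replace(b"\n", b"\r\n")


def to_unix(data: bytes) -> bytes:
    """Simplified text-mode commit from Windows: every CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")


# First eight bytes of every PNG file; it contains both a CRLF and a bare LF.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Checked in from Linux as "text", checked out on Windows: two bytes longer.
print(to_windows(PNG_SIGNATURE))  # b'\x89PNG\r\r\n\x1a\r\n'

# Checked in from Windows as "text": the CRLF inside the signature is gone.
print(to_unix(PNG_SIGNATURE))     # b'\x89PNG\n\x1a\n'
```

Either way the file no longer starts with a valid signature, which is why the file has to be marked as binary so no translation happens at all.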

This is particularly annoying with text-like formats, like Visual Studio 6's .dsw files -- they look like text files, they smell like text files, and CVS autodetects them as text files, but Visual Studio 6 throws a tantrum if you try to hand it a .dsw file with LF line endings.

--
May we never see th
Re:Is there demand? by 0x0d0a · 2004-03-08 02:22 · Score: 2, Informative

Rolling back changes without atomic commits is a pain in the fucking ass. Have you ever had to do it? You have to track down every file that you changed (somehow... hopefully you can remember), check which version was the version prior to your commit, and get all those versions of files. For example "Okay, I need version 1.7 of foo.c and version 1.8 of barf.c and version 1.13 of foo.h." It's totally annoying.

Take a look at the -D flag. You'll be pleased.

I agree that CVS was almost mind-bogglingly crufty. It may be the single most crufty piece of software that I used regularly. Everything about CVS was defined by the way RCS worked, which just didn't make that much sense for a CVS-like environment.

--
May we never see th
Re:Is there demand? by Tom7 · 2004-03-08 02:55 · Score: 1

Do developers out there voice the need to store binaries?

Hell yes. There's often binary data in my projects, like graphics. cvs add -kb does work, but it doesn't work very well.

I've never felt a need for the other features listed. The main draw for me is that it can actually rename files and remove directories.

I'll use it as soon as sourceforge starts supporting it...
Re:Is there demand? by scrytch · 2004-03-08 03:15 · Score: 1

> Do developers out there voice the need to store binaries?

Hell yes. Worldforge has a media developers group that is using CVS, and they just hate it. The admin has to periodically go through and sweep out old media file versions because they're simply too big to keep all of them.

> Also, have there been many problems that required atomic commits? Can someone explain why this is important?

Very simple. If I change several files, and the changes depend on each other (happens every time one changes an API for instance), I damn well don't want one file changed and the others not changed if there's some problem -- I want it so my entire changeset can go through at once or not at all. CVS is probably the last SCM system in wide use now that doesn't support any notion of changesets. Right now in CVS, one usually ends up having to branch if they want their code appearing in the repository so others can work with it, then it has to get merged later while the project goes into a freeze. Meanwhile, conflicts just pile up in branches. This is no substitute for changesets.

> Also, Subversion says that it is much faster at things like tagging, but tagging is not a very frequent operation...

Says you. I do it literally every single day. It's called a daily build.

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:Is there demand? by an_mo · 2004-03-08 03:48 · Score: 1

how about versioning directories?
Re:Is there demand? by Textbook+Error · 2004-03-08 04:09 · Score: 3, Interesting

if some customer has a bug with version 2.1.2.4 of Foofware, the company can just check that out, instead of figuring out (and hoping to get it right) how to build it

Your build system is seriously broken if this is the case. The whole point of revision control is that you can get back to a previous build just by fetching a specific tag or branch. If that means that you need to keep your entire dev environment (IDE+tools straight off the CD, headers, runtime libraries, etc) under revision control then that's what you should do.

Builds have to be deterministic if you want to have reliable QA, and making the build process reproducible is at least as important as using source control. The alternative is you end up checking out a build from 6 months ago that crashes, yet when you try and build the equivalent source the crash goes away. Having to say "um, this should be the same build but this one works and that one doesn't and I can't tell you why" is a sign that something pretty serious has gone wrong in your process.

There are plenty of other good reasons to keep binary data in a revision control system (images, sound, models, data for regression tests, materials for installers, etc) but trying to avoid having to have a deterministic build process shouldn't be one of them.

Third party libraries that you never build yourself can obviously be checked in as-is, but anything that you build from source should always be buildable from source on a brand new workspace. No ifs, no buts - if you can't produce a reliable build on demand, how do you know what's going into any of your builds?

--

Nae bother
Re:Is there demand? by tigeba · 2004-03-08 06:59 · Score: 1

"Do developers out there voice the need to store binaries? I can imagine this being needed for web developers and such, but I think programmers can just build their binaries from CVS."

Personally, I am working on a game, and there are tons of binary format files (textures, models) While many of these files can be generated from source so to speak, it would not always be practical. In addition, they take up a non-trivial amount of space, and the binary diff feature could help with this.
Re:Is there demand? by cookiepus · 2004-03-08 12:49 · Score: 1

Good point, I didn't know people do that. Why build upon every submit? To see if what you've got in the repository is compilable at all times?

--
Ecce Europa - Web Design for Business
Re:Is there demand? by ndykman · 2004-03-08 21:13 · Score: 1

Agreed that a deterministic build process is good, but by storing binaries alongside the source, that verifies that the build for version 2.1.2.4 of Foofware builds exactly the binary in the SCM system. And, if by misfortune it doesn't, you can dig around and see where the differences might lie.
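
The check described here can be as simple as comparing digests. A minimal Python sketch, assuming you already have the stored binary and the freshly rebuilt one as bytes; `digest` and `verify_rebuild` are hypothetical helper names, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def digest(data: bytes) -> str:
    """Hex SHA-256 of a build artifact."""
    return hashlib.sha256(data).hexdigest()


def verify_rebuild(stored_binary: bytes, rebuilt_binary: bytes) -> bool:
    """True if rebuilding the tagged source gave back the archived binary."""
    return digest(stored_binary) == digest(rebuilt_binary)


# Stand-in byte strings; in practice these would be read from the checked-out
# binary and from the output of a fresh build of the same tag.
archived = b"foofware 2.1.2.4 object code"
print(verify_rebuild(archived, b"foofware 2.1.2.4 object code"))  # True
print(verify_rebuild(archived, b"foofware 2.1.2.4 object cod3"))  # False
```

Note that a build which embeds timestamps or host names will fail this check even when nothing meaningful changed, which is itself a hint that the build is not fully deterministic.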

Also, there is the case when you just want a particular configuration. Let's say you want version 1.2.4 of your SuperOS to be installed to reproduce a bug for a customer. The bug is not in the OS, but you want to reproduce the environment quickly.

A build could take a long time, and you may not have the resources to exactly recreate what the customer has. But, if you have the binaries, you just check them out and load them.

You should be able to rebuild on demand, but that's not always possible. Certainly, it is to be desired (and demanded for recent versions), but if you can't even produce the binary version of software, then you really have a problem.
Re:Is there demand? by Thing+1 · 2004-03-09 17:22 · Score: 1

Yep. It's something of a corollary to the "many eyes == shallow bugs" theory; the more builds you do as a function of submissions, the more easily you can pinpoint where the break was submitted.
Especially if it's not a compile error, but something found during regression testing. The build engine should email (or otherwise notify) the regression test suite machines upon a build's completion, which should then kick off the automated regression tests, emailing results to the team and managers when done.
Yes, I do this for a living. ;-) One great free tool is Tinderbox, which you can find at the Mozilla.org site. I'm currently trying to get it to understand ClearCase so if there's anyone out there who has already done so, I'd appreciate a few pointers. (ClearCase was not my decision, I come from a Perforce background, but ClearCase is what my current employer standardized on and they've got it integrated with ClearQuest for bug tracking; changing it involves more political clout than I currently have.)

--
I feel fantastic, and I'm still alive.
Re:Is there demand? by wideBlueSkies · 2004-03-10 03:26 · Score: 1

>>Do developers out there voice the need to store binaries?

Yup.

My project is versioning all our documentation. So that means Word and Excel files. Right now, we're doing it in Visual Source Safe (which I hate), but at the project start there was no alternative.

Also, on the development side, there are binary files: images, report definition files, vendor libraries... lots of stuff.

wbs.

--
Huh?
Re:Is there demand? by ahdeoz · 2004-03-10 09:31 · Score: 1

I don't know if I would trust storing diffs of binary files (like media) in version control. It seems better to have them complete. (Being able to diff, i.e. checksum, is different though.) I can't think of a reason, except what if there's a bug? But my real reason for commenting is to say, thanks for the sig.

Re:Summary? by Hamster+Of+Death · 2004-03-07 16:30 · Score: 2, Informative

See the project front page
Subversion

Windows server? by Xtifr · 2004-03-07 16:42 · Score: 1, Interesting

It's nice that you can run a subversion server on MSWin server systems, I suppose, if that's the sort of thing that floats your boat. But how on earth is the option to spend hundreds of extra dollars on proprietary operating system software and the more-expensive hardware it requires "significantly lower[ing] the barrier to entry?"

There may be a minor barrier, in Win-only shops (although I would say that it's the Win-only policy that is the barrier, not the other way around). Like I say, Win support is a perfectly nice thing. But "significant"?

Re:Windows server? by gears5665 · 2004-03-07 17:09 · Score: 1

Cost isn't an issue.

In a Microsoft Shop developers will use Microsoft SourceSafe. period. Subversion doesn't have a chance to compete because there is absolutely no way that it can integrate fully into the .Net development tools the way Microsoft's Own Source Storage Software is designed to do.

And to be honest...it's not that expensive for one copy per shop.
Re:Windows server? by Anonymous Coward · 2004-03-07 17:18 · Score: 1, Informative

You can't be serious. Most serious shops with large development teams, say over 50 programmers, use other source control software. Very few mid-size firms use SourceSafe, which sucks big time. Even hardcore MS people I know say it sucks big time.
Re:Windows server? by Eneff · 2004-03-07 17:48 · Score: 4, Insightful

How about individuals wanting source control on their at-home projects? I'm sure not going to spend the money on the MS control, but I don't have a *ix box up 24/7 either. (I use my laptop nearly exclusively, and my laptop hardware supports Windows better.)
Re:Windows server? by ogre57 · 2004-03-07 18:36 · Score: 2, Informative

In a Microsoft Shop developers will use Microsoft SourceSafe. period.

No, they won't. Can think of several shops/teams using PVCS, plus a handful on other products, but none using MSS. Up front (purchase) cost isn't much of an issue. Time cost (TCO) very much is. MSS is simply much too slow to be competitive.
Re:Windows server? by ajagci · 2004-03-07 19:50 · Score: 1

But how on earth is the option to spend hundreds of extra dollars on proprietary operating system software and the more-expensive hardware it requires "significantly lower[ing] the barrier to entry?"

Sunk costs: people have already paid for Windows, and they have already invested in training for it. But if you can get these people to move to OSS for some things, maybe you can get them to switch for others as well.
Re:Windows server? by cyborch · 2004-03-07 20:55 · Score: 2, Informative

Also, and much more importantly: MSS only does file locking - not merging file content. It can hardly be called a "real" versioning system.
Re:Windows server? by Adrian · 2004-03-07 22:32 · Score: 3, Informative

In a Microsoft Shop developers will use Microsoft SourceSafe. period

Not in my experience. Some do and some don't. The pain you avoid by not using VSS compensates for the lack of tool integration. Even MS doesn't use VSS internally ;-)

Subversion doesn't have a chance to compete because there is absolutely no way that it can integrate fully into the .Net development tools the way Microsoft's Own Source Storage Software is designed to do.

I think the people writing the Subway and sourcecross subversion-SCC interfaces might disagree with you there.
Re:Windows server? by spongman · 2004-03-07 23:11 · Score: 4, Informative

I'm running svnserve on a windows box in a production environment and it works great.
If you want to start svnserve as a windows service, google for srvany.exe, it allows you to run a regular win32 exe as a service.
Re:Windows server? by spongman · 2004-03-07 23:19 · Score: 2, Informative

I should add: I'd definitely recommend installing TortoiseSVN. Having the SVN operations available as a shell extension is a godsend. For example you can use SVN from within any FileOpen dialog. The only thing it's missing is a directory-diff, but on XP you can show the SVN status of files in explorer by configuring the attribute columns in the details view.
Also, I'd recommend downloading perforce's p4win 3-way merge tool. It's a little better than the one built into TortoiseSVN.
Re:Windows server? by an_mo · 2004-03-08 04:03 · Score: 1

... and if you want to use it with ssh on top of svnserve you can read this step by step guide
Re:Windows server? by Anonymous Coward · 2004-03-08 04:09 · Score: 0

Perforce has a free two-user version. Use it until you grow. Then, pay if things get to that point. Or, if you get to the point you want support. Their support is excellent, BTW.
Re:Windows server? by alienmole · 2004-03-08 04:45 · Score: 1

On the contrary, for a long time replacing Sourcesafe with CVS was a major way that I managed to sneak Linux into Windows shops (before it became easy to set up CVS servers on Windows). Windows developers who made the switch loved it. Nowadays, they use Linux for all sorts of other reasons, and there's even less reason for them to stick to inferior Microsoft tools.
Re:Windows server? by drinkypoo · 2004-03-08 07:31 · Score: 1

Jalindi Igloo, CVS SCC plugin for Microsoft Visual Studio and other compliant IDEs
CVSIn, CVS Integration Add-in for Microsoft Visual Studio
CVS SCC proxy is the SCC API plug-in which provides access from practically all Microsoft SCC enabled software to the general CVS repositories.

I don't know how well any of them work, one page asserts that igloo is crap and the SCC proxy is the holy grail. I have no idea if any of this is true, just googling a little.

--
"You're right," Fisheye says. "I should have set it on 'whip' or 'chop.'"
Re:Windows server? by Xtifr · 2004-03-08 09:25 · Score: 1

If cost isn't the issue, then what is? For MS-only shops, the issue is the MS-only policy itself, and that's just a policy. It's purely voluntary, I don't see how a voluntary policy counts as a significant barrier. A barrier, yes, but significant?

As for using MS-branded tools only, that's irrelevant, as the ability to run subversion servers on windows doesn't do anything to that barrier. Perhaps you didn't read the article?
Re:Windows server? by Xtifr · 2004-03-08 09:57 · Score: 1

Why would you want to use CVS in server mode if you're not using multiple boxes? AFAIK, CVS runs just fine in standalone mode on Win. And for "at-home" projects, that should be sufficient. There's other alternatives too, like external hosting: sourceforge and whatnot.
Re:Windows server? by Anonymous Coward · 2004-03-09 08:27 · Score: 0

The biggest Microsoft shop of them all (Microsoft) doesn't use source safe. They use perforce.

Most of the commercial offerings have good integration with MS (and other) tools. No reason subversion couldn't.
Re:Windows server? by jovlinger · 2004-03-09 10:52 · Score: 1

yup.

I worked with VSS and Visual Age. A check in went like this:

1) lock entire VSS source tree
2) export VSS onto your HD.
3) export your Visual Age source ontop of the VSS source tree
4) use another versioning tool to tell you which files you've touched.
5) check only these files back into VSS
6) unlock.

A complete PITA, basically because we were using two repositories in parallel. Every so often, they would fall out of sync, or someone would get the order wrong and go backwards in time.
Re:Windows server? by mgm · 2004-03-09 11:37 · Score: 1
If cost isn't the issue, then what is? For MS-only shops, the issue is the MS-only policy itself, and that's just a policy. It's purely voluntary, I don't see how a voluntary policy counts as a significant barrier. A barrier, yes, but significant?

As for using MS-branded tools only, that's irrelevant, as the ability to run subversion servers on windows doesn't do anything to that barrier. Perhaps you didn't read the article?

I wrote the article. My reasoning goes something like this:
- The CVS server, in my experience, doesn't run very well on Windows. Last time I tried CVSNT, it hung randomly whilst importing 200 megabyte source trees. I threw the same job at Subversion 0.30.0 (six months ago) and it worked like a charm. I conclude from this and previous experience that a CVS server is better off on Unix.
- Subversion servers work great on Windows. It's the same source code, packaged as Windows binaries and released by the Subversion development team. Thanks largely to the Apache Portable Runtime (APR), Windows isn't a second-class hosting platform for Subversion.
- There are plenty of "Windows only" shops, for a variety of reasons (experience, training, policy, sunk costs, inertia). Suggesting such shops should not be Windows only is kinda silly -- they just are, I often have to live with it when working on a project. Introducing Subversion to an organisation like that is as simple as finding an underutilised desktop and installing a Subversion server.
- Getting a Subversion server up on a Unix machine is often a lot of work, because those machines tend to (not always, sure) have more red-tape surrounding them. Installing a random server isn't many admins' cup of tea.
I conclude that having Windows fully supported as a first-class server platform for Subversion significantly lowers the barrier for entry for someone introducing Subversion to an enterprise. From a personal point of view, it also makes it easier for me to run a Subversion server on a laptop, where I'm running Windows.

All your files are belong to us by wayne606 · 2004-03-07 16:50 · Score: 4, Interesting

It bothers me a bit that all the files are now in a big database. A good thing about CVS is that you can see what files and modules are available using regular unix tools, and if things get messed up in some way you can always fall back to the rcs commands or in the worst case edit the ,v file by hand and extract the latest version. With a database, if things were to get corrupted enough (I have no evidence that this happens often, but still...) you are stuck. Just like with the windows registry, where if it gets messed up you lose big.

Any opinions on this?

Re:All your files are belong to us by gears5665 · 2004-03-07 17:12 · Score: 1

well...the data in the database has to exist on disk somewhere...probably a raw encrypted file somewhere...I imagine a backup of this file(s) will allow it to be used in another version of the software...I can't really tell without more research.
Re: All your files are belong to us by dubhead · 2004-03-07 17:37 · Score: 0

Frequent backups or dump command might help you.

Some seasoned CVS users hack the RCS files to achieve file renaming. SVN users don't have to, because SVN supports that function.
Re: All your files are belong to us by zatz · 2004-03-07 17:42 · Score: 1

"Hack" the RCS files? mv(1) hardly qualifies as hacking.

--

Java: the COBOL of the new millenium.
Re:All your files are belong to us by Anonymous Coward · 2004-03-07 18:35 · Score: 2, Informative

Not only is it in a database, it's in a Berkeley DB. Some thoughts on this:

1) there is absolutely nothing about a version control system that requires a key/value database like Berkeley DB. I think they just use it to get free locking and transactions. Strange.

2) Berkeley DB is ultra-sensitive. Ever had to deal with a locked Berk DB, when no process was running that had it locked? You have to manually break the locks. Fun. This hasn't happened to me with subversion (yet), but I expect it to be a problem.

3) the *filesystem* already gives you atomic operations and so forth. They could've used that, and then written a thin compatibility layer for windows, which doesn't have posix filesystem semantics.

*grumble* *grumble* overengineering *grumble*
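
For reference, the filesystem primitive point 3 is leaning on is usually "write a temp file, then rename it over the target". A minimal Python sketch, assuming a POSIX-style filesystem where rename within one directory is atomic; `atomic_write` is a hypothetical helper name, and this shows one file only, not a whole multi-file commit.

```python
import os
import tempfile


def atomic_write(path, data: bytes):
    """Replace `path` so readers see either the old or the new content, never half."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as the
    # target, otherwise the final rename is not a single atomic step.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp, path)     # the atomic step
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

Making several files change together this way takes more machinery (for example building a new directory and swapping a pointer to it), which is roughly the part a transactional database hands you for free.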
Re:All your files are belong to us by magnum3065 · 2004-03-07 18:49 · Score: 4, Informative

Someone else already mentioned the ability for live backups with Subversion. Another benefit of the database is built-in journaling support. Berkeley DB logs any changes before making them, so if your system crashes or something, the DB will be restored to a stable point. This is MORE reliable than what CVS offers, even with a journaling filesystem. Also I'm pretty sure that if you REALLY need to hack the DB, there are utilities that will let you do this. However, most of the scenarios that CVS admins needed to hack the ,v files for are no longer a problem in Subversion.
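
To make the "log first, then change" idea above concrete, here is a minimal sketch in Python. It is a toy redo log, not Berkeley DB's actual recovery algorithm; the class name ToyStore and its methods are made up for illustration. Changes are appended to a log before anything else happens, and recovery rebuilds the data from committed transactions only, so a crash in the middle of a transaction leaves no trace.

```python
class ToyStore:
    """Toy write-ahead log; an illustration only, not Berkeley DB's design."""

    def __init__(self):
        self.log = []    # stands in for the on-disk log, always written first
        self.data = {}   # stands in for the database contents

    def put(self, txn, key, value):
        # Record the intended change in the log before touching the data.
        self.log.append((txn, "put", key, value))

    def commit(self, txn):
        # A commit record marks every earlier change of this transaction as durable.
        self.log.append((txn, "commit", None, None))

    def recover(self):
        # After a crash: replay only the changes whose transaction committed.
        committed = {t for t, op, _, _ in self.log if op == "commit"}
        data = {}
        for t, op, key, value in self.log:
            if op == "put" and t in committed:
                data[key] = value
        self.data = data
        return data


store = ToyStore()
store.put(1, "main.c", "revision 1")
store.commit(1)
store.put(2, "main.c", "revision 2")   # the crash happens before commit(2)
recovered = store.recover()
print(recovered)
```

Transaction 2 never committed, so recovery keeps the contents written by transaction 1.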
Re:All your files are belong to us by nthomas · 2004-03-07 19:05 · Score: 5, Insightful

It bothers me a bit that all the files are now in a big database.
When you used PostgreSQL, MySQL, or Oracle, does it bother you that your data is in a big database? Why do you worry so much about Subversion then?
A good thing about CVS is that you can see what files and modules are available using regular unix tools, and if things get messed up in some way you can always fall back to the rcs commands or in the worst case edit the ,v file by hand and extract the latest version.
It is a good thing that you were able to hand-edit CVS repositories when they got corrupted -- because corrupt CVS repositories are a dime a dozen.
I've been using Subversion since January 2002 (yes, a full two years before 1.0 came out.) and I have never, ever, ever seen a corrupt repository or heard about one on the mailing lists. When someone did claim that they thought Subversion corrupted their repositories, the Subversion devs dropped everything to make sure this wasn't the case. AFAIK, it has never happened. (Usually it was the person using multiple servers to access their repo or putting their repo on a network share (Berkeley DB doesn't work over NFS/AFS/CIFS.))
Let me quote a Slashdot posting of mine from a couple of years ago:
...there is nothing that the dev team values more than the integrity of your data. Nothing. This means that once something has been committed, it will never be lost.
My opinion has not changed in the past two years.
Thomas
Re:All your files are belong to us by Adrian · 2004-03-07 21:28 · Score: 2, Informative

With a database, if things were to get corrupted enough (I have no evidence that this happens often, but still...) you are stuck. Just like with the windows registry, where if it gets messed up you lose big.

I worry more about disk crashes and accidental deletions. This is what backups are for ;-)

You can also serialise everything into a fairly human readable file with svnadmin dump, and restore it with svnadmin load, if you feel you need something non-binary.

Really not a problem as far as I'm concerned.
Re: All your files are belong to us by Anonymous+Conrad · 2004-03-08 02:25 · Score: 1

"Hack" the RCS files? mv(1) hardly qualifies as hacking.

There's more to it than that. If you have two development branches and you mv(1) the ,v file then you've renamed the file on *both* branches, and that's usually not what you want to do.

The hack is to copy the ,v file to the new name and then edit the branch/tag/liveness state in the files so the rename only happened on the tags and branches where you wanted it to but so you've still got as much history as possible.
Re:All your files are belong to us by scrytch · 2004-03-08 03:08 · Score: 1

> It bothers me a bit that all the files are now in a big database.

You think a filesystem isn't a database? It bothers me more when all the files are on an ext2fs filesystem; hope that UPS has been checked recently. Perforce uses a database as well (in fact it's the same, berkeley db or some *dbm), and I've never heard of it eating a repository. Being able to change the db backend for subversion would be nice though. In fact I'd consider it pretty damn critical for any organization-wide SCM repository, since I'd want replication (read-only of course, I'm not that masochistic).

Welcome to slashdot btw. All your comments are in a database. I'm not sure how much of a case that makes for a database however...

--
I've finally had it: until slashdot gets article moderation, I am not coming back.
Re:All your files are belong to us by mgm · 2004-03-08 05:33 · Score: 1

Perforce used to use BDB, yes, but around 18 months ago they switched to a home-grown C++ DB of some sorts. Doing so required that it perform a dump/reload cycle (just like Subversion does when a major database upgrade is required -- think "maybe 2.0 and a long way off" in Subversion's case). This is yet another Subversion-is-like-Perforce moment where I get nice fuzzy feelings about Subversion. Perforce is database backed, with plaintext dump files, repository-wide revision numbers, cheap branches, and I still think it's the best system I've used. It's not free, of course...
Re:All your files are belong to us by thelenm · 2004-03-08 08:09 · Score: 2, Informative

I'm not sure that being able to edit the ,v files by hand is an advantage of CVS. If anything, I see it as a disadvantage since: a) you're making changes "behind the system's back"; and b) it's easy to screw up.

The fact that Subversion uses a Berkeley DB file backend doesn't mean you're hosed in case of problems, especially if you've been backing your data up. You can make a live backup anytime you want - with every commit, if you're paranoid. It's also possible to dump any or all commits to a human-readable format that can also be used to restore. But usually you won't even have to muck around with restoring from backup - if the repository gets wedged somehow, try 'svnadmin recover' and it will usually solve the problem.

There's a nice chapter in the Subversion online book that deals with all this stuff.

--
Use Ctrl-C instead of ESC in Vim!
Re:All your files are belong to us by ray-auch · 2004-03-08 08:14 · Score: 2, Interesting

Personally, all the data in Oracle, (SQL Server even) or PostgreSQL wouldn't bother me, MySQL might worry me a little, MS Jet / Access worries me a lot. Berkeley DB I'm not sure about, I know a little of its heritage on unix but would be a lot less sure on other platforms.

A lot of people's experience with source control and DBs will be coloured by Visual Source Safe and Jet (which it uses). It is ok until it gets corrupted, and then you are hosed. Keeping everything in readable files CVS-style is a BIG plus point once you've been in that situation.

I'm confused on your corruption statement - you seem to say both that it never happens, and that subversion never does it but other things ("Usually it was the person using multiple servers...") do. Which is it ? And if the latter, what recovery options are there ?

I am also wary of database-based products which are tied to one particular database - makes me worried there are low level hacks being relied on. I think a lot of people (well, me for one) would like to run _one_ rdbms on _one_ db-optimised server managed by _one_ dba - not a dozen different ones all over the place which all have to be managed differently (backup Oracle here, backup Exchange (yuk) here, backup MySQL here for appY, backup SQL here for AppX and now add another special here for source control...).

With stuff in one rdbms it is also easy to relate stuff together in queries (query source control operations related to versions in a trouble tickets app, for example).

If it supported multiple (at least two) rdbms from the outset configurable via odbc/jdbc/etc., preferably also with an open schema and "just use sql like this to get file x version y from project z" - then it would give me (for one) far more confidence that it was worth looking into further.

PS. I haven't had sourcesafe (still have to use it for some stuff) corrupt a db in over the past two years either - the horror of seeing >5yrs of the whole team's code history suddenly inaccessible (shortly after tape drive problems...) stays fresh in your mind a lot longer.
Re: All your files are belong to us by aled · 2004-03-08 09:50 · Score: 1

Do you mean that because the "workaround" (if you don't want to call it "hack") is "easy" is better than implement the right functionality?

PD: Java isn't.

--

"I think this line is mostly filler"
Re:All your files are belong to us by empty · 2004-03-08 11:06 · Score: 4, Informative

...It is ok until it gets corrupted, and then you are hosed. Keeping everything in readable files CVS-style is a BIG plus point once you've been in that situation...
...I am also wary of database-based products which are tied to one particular database...

Subversion has a utility that might assuage your fears:
svnadmin dump
The dump command can do a (full or incremental) dump of your repository such that you can completely recreate its history. If you use this command for backup, you will be assured that you don't lose any data.

As a bonus, the dump file is human readable, so there should be no fear of losing data to an inscrutable binary file.
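
For anyone who wants to script this, here is a minimal sketch in Python that assembles the dump command line and writes the output to a file. The helper names (dump_command, backup) and the repository path are hypothetical; the -r and --incremental options are the ones svnadmin dump documents. A dump file made this way is restored into a fresh repository by feeding it to svnadmin load on standard input.

```python
import subprocess


def dump_command(repo_path, revisions=None, incremental=False):
    """Build the argument list for `svnadmin dump` (hypothetical helper)."""
    cmd = ["svnadmin", "dump", repo_path]
    if revisions is not None:
        cmd += ["-r", revisions]        # e.g. "100:200" dumps only that range
    if incremental:
        cmd.append("--incremental")     # first dumped revision is a diff, not a full tree
    return cmd


def backup(repo_path, dump_file, revisions=None, incremental=False):
    """Run the dump and store its output; raises if svnadmin exits non-zero."""
    with open(dump_file, "wb") as out:
        subprocess.run(dump_command(repo_path, revisions, incremental),
                       stdout=out, check=True)


# Example command lines; /srv/svn/repo is a made-up path.
full = dump_command("/srv/svn/repo")
nightly = dump_command("/srv/svn/repo", revisions="100:200", incremental=True)
print(" ".join(full))
print(" ".join(nightly))
```

Only the command lines are printed here; calling backup() needs an actual repository and svnadmin on the PATH.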
Re:All your files are belong to us by scott_davey · 2004-03-08 20:40 · Score: 1

A lot of people's experience with source control and DBs will be coloured by Visual Source Safe and Jet (which it uses). It is ok until it gets corrupted, and then you are hosed. Keeping everything in readable files CVS-style is a BIG plus point once you've been in that situation.

We are just migrating now from Visual Source Safe to CVS for this very reason. Unfortunately for us, one day we started checking in files, and the new files and the entire history all became lots of blank files. Time to get an open solution...

I took a look at Subversion, and I hold high hopes for it. But CVS's use of the filesystem is a big plus over any closed (vss) or difficult-to-hand-recover (BerkeleyDB) storage medium. And it seems more stable and more accepted, too.

If I was comparing revision control systems feature-to-feature, I would have probably chosen subversion. But because of my recent data loss with VSS, I just want something *safe*.

In six months time when I hit the barriers with CVS, I'll take another look at Subversion...
Re:All your files are belong to us by jekewa · 2004-03-09 03:31 · Score: 1

Another thing to consider is that it's pretty easy to set up simple database replication if you wanted to have live backups of the data. This could relieve you of the discomfort that comes with database loss, as well as provide the additional warm fuzzies of distributing the work to diverse groups or safely beyond a firewall.

--
End the FUD
Re:All your files are belong to us by larry+bagina · 2004-03-10 07:42 · Score: 1

Perforce offers a free 2-seat version (i think it used to be 1 seat), which is suitable for individuals. The tools are available on common platforms (x86 linux, freebsd, windows, mac). If you're using UnixWare you're SOL :-), but that's closed source software for you.
They also give free licenses for open source projects. There are some restrictions and some paperwork to fill out. I believe MySQL uses p4.

--
Do you even lift?
These aren't the 'roids you're looking for.
Re:All your files are belong to us by ahdeoz · 2004-03-10 09:56 · Score: 1

I've been using CVS since 1996 and have never had a corrupt repository. How's that for anecdotal proof?

Some answers by magnum3065 · 2004-03-07 17:30 · Score: 5, Informative

Ok, I saw some questions about why people should switch from CVS to Subversion. The article does a nice job of covering what features Subversion adds, but people still seem to wonder why these are important.

Atomic Commits:
As stated in the article, if something goes wrong in the middle of a CVS commit (e.g. network goes down) it can leave the commit only partially complete. This can be a problem if changes in multiple files are dependent upon each other. Say I add a function to an API, then call it in another file. If the call gets committed and the API change doesn't, now the code in CVS won't compile. With atomic commits if the connection was dropped the commit would simply roll back. Then when my network came back up I could try to commit again, but the repository would never be left in a state where it didn't compile.
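
A minimal sketch of the all-or-nothing behaviour, in Python. This is a toy in-memory repository, not how Subversion is implemented; ToyRepo and the fail_after parameter (which simulates the network dropping) are invented for the example. The point is only that changes are staged on the side and published in a single step, so a failure halfway through leaves the visible state untouched.

```python
class ToyRepo:
    """In-memory stand-in for a repository; an illustration only."""

    def __init__(self):
        self.files = {}     # path -> contents at the latest revision
        self.revision = 0   # repository-wide revision number

    def commit(self, changes, fail_after=None):
        """Apply every change or none of them.

        fail_after simulates the connection dropping after that many files.
        """
        staged = dict(self.files)   # work on a private copy
        for count, (path, text) in enumerate(changes.items()):
            if fail_after is not None and count >= fail_after:
                raise ConnectionError("connection dropped mid-commit")
            staged[path] = text
        # Only reached when every file arrived: publish in one step.
        self.files = staged
        self.revision += 1
        return self.revision


repo = ToyRepo()
repo.commit({"api.c": "int f(void);",
             "main.c": "int main(void) { return f(); }"})

try:
    # The API change arrives, but the connection drops before the caller does.
    repo.commit({"api.c": "int f(void); int g(void);",
                 "main.c": "int main(void) { return g(); }"},
                fail_after=1)
except ConnectionError:
    pass

print(repo.revision, repo.files["api.c"])
```

After the failed second commit the repository is still at revision 1 with the original api.c, which is the "simply roll back" case described above.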

Constant Time Tagging/Branching:
In Subversion tagging and branching are fundamentally the same, they're both executed as a "copy" command. I'm not sure what the execution time is for these operations in CVS, though I believe it's linear to the size of the repository. In Subversion this is an O(1) operation. While one of the posts commented on tagging being an infrequent operation, this may be true, but why not let it be fast anyways? However, no matter how often you do tags, constant time branching is nice. I can at any time quickly create my own branch of a project to work from. Working in my own branch means that I can keep very granular track of my changes by committing frequently, without worrying about breaking something else. Once I'm satisfied with my changes I can merge my branch with the main code.
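
Why a copy can be constant time is easier to see in a sketch. The Python below is a toy model, not Subversion's storage format; CheapCopyRepo and its methods are made up. Stored trees are never modified, so a tag or branch is just one more name pointing at an existing tree, regardless of how many files it holds. (The toy commit copies a dict for simplicity; only the copy operation is meant to be O(1) here.)

```python
class CheapCopyRepo:
    """Toy model of cheap copies; an illustration only."""

    def __init__(self):
        self.trees = {}    # tree id -> {path: contents}, never mutated once stored
        self.names = {}    # "trunk", "tags/1.0", ... -> tree id
        self._next_id = 0

    def _store(self, tree):
        tree_id = self._next_id
        self._next_id += 1
        self.trees[tree_id] = tree
        return tree_id

    def commit(self, name, changes):
        old = self.trees.get(self.names.get(name), {})
        new = {**old, **changes}            # build a new tree, leave the old one alone
        self.names[name] = self._store(new)

    def copy(self, src, dst):
        # Tag or branch: one new name for an existing tree. No file is touched.
        self.names[dst] = self.names[src]


repo = CheapCopyRepo()
repo.commit("trunk", {"a.c": "v1", "b.c": "v1"})
repo.copy("trunk", "tags/1.0")         # tagging
repo.copy("trunk", "branches/mine")    # branching is the same operation
repo.commit("branches/mine", {"a.c": "v2"})

print(repo.trees[repo.names["tags/1.0"]]["a.c"])
print(repo.trees[repo.names["branches/mine"]]["a.c"])
```

The tag keeps seeing the old contents while the branch moves on, because committing to the branch creates a new tree instead of editing the shared one.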

Storing Binaries:
"Binaries" does not necessarily mean compiled code. There are plenty of things that can benefit from this. Anywhere you use graphics: web programming, GUI programming, or say game or other 3D programming where you want to store your models. Or, you can store documentation in the repository: PDFs, Word docs, spreadsheets, etc.

Finally, the barrier to switching isn't all that high. The command line program has quite similar syntax, so switching is pretty easy, and the other interfaces such as the web viewer, TortoiseCVS, and IDE integrations generally have counterparts for Subversion.

Well, that's all I can think of for now. I'm actually going to try to get my company to switch over to Subversion from the commercial software they were using when we start on our new product. We're using a Java applet to interface with the repository now, and it's not very nice. CVS would work, since the main thing I want is integration with Eclipse and IntelliJ Idea, but there are plugins to support this with Subversion as well. However, Subversion has nice features CVS doesn't, so I don't see any reason to use CVS over Subversion.

Live backups, baby by dFaust · 2004-03-07 17:33 · Score: 5, Insightful

This is a valid point, one that has crossed my mind in the past. But consider how many databases are out there in the world. Many with incomprehensible amounts of data. Given this, stability is obviously a number one priority to users and developers of databases, and certainly something that was considered before the Subversion folks a) chose to use a database backend and b) chose BerkeleyDB. Subversion has been self-hosted (they used Subversion for their source control) for over a year, and have yet to lose any data. While a year isn't that long, it's a start.

But using a database DOES provide advantages, as stated in the article. Mostly speed advantages, but also the ability to do live backups. If you try backing up an online (as in live) CVS server's files, there's nothing stopping people from doing commits, thus possibly botching your backup (you're no longer backing up the files you thought you were).

And when it comes down to it, backups are really where your safety lies. In the last CVS project I worked on, the repository was hosed twice. Once due to a careless admin, and once due to the hard drive dying. While we had some down time, virtually no work was lost, largely due to our nightly backups. The fact that CVS stored its data as plain text files certainly didn't protect us.

Re:Live backups, baby by halfnerd · 2004-03-07 22:18 · Score: 2, Informative

A year?

Taken from http://subversion.tigris.org/release-history.html:

Milestone 3 (30 August 2001): Subversion is now self-hosting.

there's over two years between that, and their 1.0.0 release, without *any* data loss.
Re:Live backups, baby by FattMattP · 2004-03-08 04:56 · Score: 1

and have yet to lose any data
How do they know? Have they checked out old data and compared it to backups?

--
Prevent email address forgery. Publish SPF records for y
Re:Live backups, baby by CoolVibe · 2004-03-08 09:02 · Score: 1

And when it comes down to it, backups are really where your safety lies. In the last CVS project I worked on, the repository was hosed twice. Once due to a careless admin, and once due to the hard drive dying. While we had some down time, virtually no work was lost, largely due to our nightly backups. The fact that CVS stored its data as plain text files certainly didn't protect us.
A non-issue on FreeBSD-5. Why? Filesystem snapshots. You just make a snapshot before you back up. Then back up the snapshot. When you're done, destroy the snapshot. Background fsck on FreeBSD-5 works the same way.

Be careful! by shfted! · 2004-03-07 17:55 · Score: 1

Be sure to talk to your programmers before you pull the switch on them. Not telling them would be rather subversive...

--
He who laughs last is stuck in a time dilation bubble.

Consider GCC by devphil · 2004-03-07 18:09 · Score: 5, Informative

Once a week, a snapshot release is made. That means a tag is added. This operation takes, on average, 40 minutes, because the GCC source tree is large.

Every time someone makes a branch, they create a tag just before branching (for use later on, with diffs and merging). 40 minutes to tag, another 40 minutes to branch.

All because these are, stupidly, O(n) operations instead of O(1). We'd like to move to Subversion, but can't, until they get annotate ('svn blame') fully working, because GCC developers spend a lot of time doing "revision-control archaeology".

--
You cannot apply a technological solution to a sociological problem. (Edwards' Law)

Re:Consider GCC by nthomas · 2004-03-07 19:16 · Score: 5, Informative

We'd like to move to Subversion, but can't, until they get annotate ('svn blame') fully working, because GCC developers spend a lot of time doing "revision-control archaeology".
Just curious, 'svn blame' was added 2003-10. What about it is not working for you?
Thomas
Re:Consider GCC by devphil · 2004-03-08 04:21 · Score: 2, Interesting

The person who tried it reported it wasn't working for certain branches off the main trunk. *shrug* Haven't tried it personally since the 1.0 release.

--
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Re:Consider GCC by Anonymous Coward · 2004-03-13 13:03 · Score: 0

If you really wanted to switch, you would have tried it by now, and if needed, given feedback to the Subversion team. I guess 80 minutes isn't much to you.

database is a dependency by ajagci · 2004-03-07 19:44 · Score: 1, Interesting

The problem is that putting the stuff into a database creates another dependency on a non-trivial piece of software. That creates all sorts of risks.

On UNIX/Linux systems, the file system is more than sufficient for handling this kind of storage and transactioning, so this dependency and risk is unnecessary.

I suspect Subversion uses a database because it may be intended to run on operating systems with less powerful file systems.

Re:database is a dependency by deKernel · 2004-03-08 00:22 · Score: 2, Insightful

You really need to explain the statement:

I suspect Subversion uses a database because it may be intended to run on operating systems with less powerful file systems.

A filesystem should not be used to hold multiple versions of a file as well as the meta-data associated with it. Let's not forget the associations among the multiple files that make up a project. This is the work of a database, hence Berkeley DB. If you are concerned about "repairing" a file (aka db), there are command-line tools for just such an event, but you will probably find that you just won't ever need them. Just my 5000 shekels.
Re:database is a dependency by ajagci · 2004-03-08 02:09 · Score: 1, Insightful

A filesystem should not be used to hold multiple versions of a file as well as the meta-data associated with it. Let's not forget the associations among the multiple files that make up a project. This is the work of a database, hence Berkeley DB.

The UNIX file system is a database. That's what it is designed to be, that's what it is used as, and that's what it is good at. It has an extensive set of tools for manipulating it and lots of excellent GUIs for dealing with it. Some current UNIX/Linux file system implementations are, in fact, implemented just like database software.

You are just mindlessly repeating what generations of Windows hackers stuck on flaky FAT file systems have told you.

If you are concerned about "repairing" a file (aka db), there are command-line tools for just such an event, but you will probably find that you just won't ever need them. Just my 5000 sheckles.

No, I'm "concerned" about having to use an entirely different set of tools to manipulate data in a DB, about DB performance for blobs, and for a Subversion installation breaking when I upgrade the DB shared library. All of those aren't theoretical problems, they actually happen in practice. And on UNIX, they are completely unnecessary problems and risks.

As I was saying, the Berkeley DB decision may make sense if the Subversion server is supposed to run on Windows or on MacOS. But as far as UNIX and Linux are concerned, it's a no-brainer: this kind of data belongs directly in the file system, not in databases.
Re:database is a dependency by TheSunborn · 2004-03-08 03:12 · Score: 1

But the filesystem (at least if you use POSIX functions to communicate with it) still lacks atomic writes, transactions and rollback. Of course you can implement all this on top of the filesystem, but then you just end up with something similar to the DB shared library.
Re:database is a dependency by ajagci · 2004-03-08 05:24 · Score: 1

But the filesystem (at least if you use POSIX functions to communicate with it) still lacks atomic writes, transactions and rollback.

UNIX file systems, of course, have atomic write, transactions, and rollback, they just aren't called that. Read some books on UNIX and look at how this sort of thing is handled in systems that do, in fact, use the UNIX file system as a database. For its particular application areas, the UNIX file system is so good that many relational databases use it to store blobs in (rather than the other way around).

(I don't know whether POSIX mandates the various behaviors that make the UNIX file system work, but that's another issue. We are talking about UNIX and Linux here, not POSIX.)
Re:database is a dependency by nosferatu-man · 2004-03-08 06:52 · Score: 0

UNIX file systems, of course, have atomic write, transactions, and rollback, they just aren't called that.

Tell me, sensei, how to do a multiple file rollback on a raw Unix filesystem? Or how I can ensure transactional integrity without a transaction manager? Oh wait -- you can't. The facility doesn't exist.

'jfb

--
To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
Re:database is a dependency by aled · 2004-03-08 09:44 · Score: 1

"the Berkeley DB decision may make sense if the Subversion server is supposed to run on Windows or on MacOS."

Subversion is meant to be portable. I've used it on Windows, and AFAIK there's nothing to prevent it working on Mac OS X.
I may agree that a relational database may not be the ideal database to run a versioning system, but a raw filesystem seems worse to me.

--

"I think this line is mostly filler"
Re:database is a dependency by ajagci · 2004-03-08 10:38 · Score: 1

Tell me, sensei, how to do a multiple file rollback on a raw Unix filesystem? Or how I can ensure transactional integrity without a transaction manager?

With "link", "fsync", lock files, and directories. Read some UNIX source code to see how to build more complex transactional guarantees on top of that.

And, no, I'm not your "sensei". If you want an education, pay someone or at least buy yourself a good UNIX book.

Oh wait -- you can't. The facility doesn't exist.

What's lacking is not some UNIX system call, but experience and knowledge on your part. People like you go for the most complex solution because they just can't figure out how to keep things simple, and that's at the root of a lot of the problems with software today.
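
For reference, a minimal Python sketch of the kind of building blocks named above: an exclusive lock file plus write-to-temp, fsync, then rename. The names atomic_write and LockFile are made up for illustration, and this is one common pattern, not code from any particular UNIX tool. It covers a single file only; it says nothing either way about the multi-file rollback question being argued in this thread, and it does not fsync the containing directory.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace path with data so readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)   # temp file on the same filesystem
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())    # push the bytes to disk before the rename
        os.replace(tmp, path)       # rename within one filesystem is atomic on POSIX
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)          # clean up the temp file on any failure
        raise


class LockFile:
    """Advisory lock: creating the file with O_EXCL succeeds for one holder only."""

    def __init__(self, path):
        self.path = path

    def __enter__(self):
        fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        os.close(fd)
        return self

    def __exit__(self, *exc):
        os.unlink(self.path)
        return False


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "file,v")
with LockFile(os.path.join(workdir, "file.lock")):
    atomic_write(target, b"revision 1\n")

with open(target, "rb") as f:
    print(f.read())
```

A second process that tries to take the same lock while it is held gets FileExistsError and has to wait or give up; a stale lock left by a crashed process has to be removed by hand.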
Re:database is a dependency by nosferatu-man · 2004-03-08 12:57 · Score: 1

Bullshit. Your "simple" solution requires AT LEAST as much conceptual overhead and abstraction as using a proper database, and, as an added nonbonus, you're stuck with the '70s era Unix filesystem semantics and security model -- and, even better, you're at the mercy of the implementors of a system that's only tangentially related to a proper database. Anyone who'd ever built even a moderately complex application on top of a Unix filesystem would blanch at your suggestion that a combination of "lock files" and Unix primitives provides anything like a universal guarantee of atomicity.

Now, if you don't need atomicity, or transactional guarantees, or any of the other goodies that a proper database provides, I agree that the filesystem can be a viable alternative. And you're also right (excuse me for putting words in your mouth here) when you claim that people are too eager to build applications on top of a full-featured database system.

Not in this case, however. svn's use of Berkeley DB is to me, perfectly appropriate.

'jfb

--
To spur "enterprise Linux," Big Bang, the distributed two-phase commit.
Re:database is a dependency by jrexilius · 2004-03-08 14:02 · Score: 1

I have to interject here: although using locks, symlinks, and other hacks for POSIX-based filesystems is probably a bad idea, ReiserFS is planning on adding meta-data and some other features that begin to edge into DB territory.

Although the other poster seems to have a notion of a point, it's not really valid with EXTn, UFS, or others.
Re:database is a dependency by ajagci · 2004-03-08 17:32 · Score: 1

Bullshit. Your "simple" solution requires AT LEAST as much conceptual overhead and abstraction as using a proper database,

It may well require the same amount of "conceptual overhead", but it has much less coupling, fewer software dependencies, much less total code, and fewer system calls for each transaction.

as an added nonbonus, you're stuck with the '70s era Unix filesystem semantics and security model

Yes: you are "stuck with" a set of time-tested semantics and a proven security model. (Of course, what any of those comments have to do with Berkeley DB, I don't know, given that Berkeley DB actually relies on file system semantics for its security.)

Now, if you don't need atomicity, or transactional guarantees, or any of the other goodies that a proper database provides, I agree that the filesystem can be a viable alternative.

Yes, you keep telling us that you yourself don't know how to do those things with the UNIX file system. What's your point?

Anyone who'd ever built even a moderately complex application on top of a Unix filesystem would blanch at your suggestion that a combination of "lock files" and Unix primitives provides anything like a universal guarantee of atomicity.

No, not "anyone", only "almost anyone". And at some point 95% of the world believed the earth was flat, too. So what? Most people don't know what they are doing and don't use their own brain, they just mindlessly repeat what they have heard. It really doesn't matter to me, and it shouldn't to anybody else, what the majority of people say. Use your own head for a change.

Not in this case, however. svn's use of Berkeley DB is to me, perfectly appropriate.

Subversion's use of Berkeley DB is appropriate not because it is needed on UNIX, but because Subversion wants to be independent of the underlying file system.

However, it is cause for concern. For example, I know that the UNIX file system can easily and efficiently handle files that are gigabytes in size (because lots of people are using UNIX file systems that way), but I have less confidence that Berkeley DB can handle many records well that are that big.
Re:database is a dependency by Joseph+Vigneau · 2004-03-08 17:56 · Score: 1

For example, I know that the UNIX file system can easily and efficiently handle files that are gigabytes in size (because lots of people are using UNIX file systems that way), but I have less confidence that Berkeley DB can handle many records well that are that big.
From Sleepycat:
Databases up to 256 terabytes
Berkeley DB uses 48 bits to address individual bytes in a database. This means that the largest theoretical Berkeley DB database is 2^48 bytes, or 256 terabytes, in size. Berkeley DB is in regular production use today managing databases that are hundreds of gigabytes in size.
Keys and values up to 4 gigabytes
New applications, including multimedia storage and playback systems, must manage individual data values that are large. Berkeley DB is able to store single keys and values as large as 2^32 bytes, or four gigabytes, in size.
4G/256T is big enough for most applications... FWIW, Subversion allows different backends, however no others have been written yet.
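
The two limits quoted from Sleepycat follow directly from the address widths, taking "terabyte" and "gigabyte" in the binary sense. A quick check in Python (the variable names are just for this example):

```python
TIB = 2 ** 40   # one terabyte, binary sense
GIB = 2 ** 30   # one gigabyte, binary sense

max_database_bytes = 2 ** 48   # 48-bit byte addresses within a database
max_item_bytes = 2 ** 32       # largest single key or value

print(max_database_bytes // TIB)   # 256
print(max_item_bytes // GIB)       # 4
```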
Re:database is a dependency by ajagci · 2004-03-09 12:34 · Score: 1

4G/256T is big enough for most applications...

Yes, and how many applications actually use it that way? File systems are being used that way by numerous applications every day. In fact, most databases store blobs in the file system.

FWIW, Subversion allows different backends, however no others have been written yet.

Someone should write a file system backend...
Re:database is a dependency by Anonymous Coward · 2004-03-10 07:13 · Score: 0

A few years ago, I and a handful of other PhD candidates tried to do just that. We gave up and went with a real database. Unix primitives worked fine with a single user on a linux box, but NFS and multi-clustered AIX didn't respect the lock files in all circumstances. For multiple files, there is still a small window in which dropping the connection could leave some files updated, some files not updated.
If you need something real simple, like qmail's maildir, the unix FS is good. but it's not useful as a real db.

I've tried both Subversion and Arch by dozer · 2004-03-07 20:14 · Score: 4, Informative

Subversion good points:

Finger feel is very similar to CVS
Flexible directory layout & tagging
Extremely stable development.

Subversion Bad Points:

Database & log files take up a LOT of space.
Quite hard to share repositories
No way to mark your branches (if you accidentally check out the directory containing your branches, you just got 50 gigs of 99.9% identical files...)
No distributed development
Pretty weak merging

Arch Good Points:

Extremely good distributed development
Super easy to share repositories
Pretty strong merging.
Very stable development

Arch Bad Points:

Forces you to give your projects weird names ("my-project--branch-1--1.1").
Forces each branch into a different top-level directory in your archive ("my-project--branch-2--1.1").
Doesn't feel anything like CVS.
Pretty slow (but they're working on it).
Somewhat difficult to resolve merge conflicts

I wish I could love Arch because distributed development absolutely rules. I could tolerate its bizarre command set, but I simply won't accept arbitrary (and ugly) constraints on what I name my projects and branches.

Verdict: I'm still using CVS. Subversion is very close to pleasing me enough to switch... I'll probably ditch CVS some time this year.

Re:I've tried both Subversion and Arch by natmsincome.com · 2004-03-07 21:54 · Score: 4, Informative

Some of your Bad points for Subversion don't sound quite right:

*Quite hard to share repositories

The repositories can be read using any WebDAV compliant software. If you're talking about on the web, the article says you can use viewcvs as a web interface. If you want people to connect to the server then it should be set up by default, as it's client server.

*No distributed development

If you're talking about multiple servers like bitkeeper then I can't help you *I know nothing* but if you're talking about client server then there's a misunderstanding as it's been designed to be client server.

I may have misunderstood what you were saying but the comments were a bit vague.
Re:I've tried both Subversion and Arch by Adrian · 2004-03-07 22:21 · Score: 1
Database & log files take up a LOT of space

This has got a lot better recently, and with the latest Berkeley DB you don't have to worry about cleaning up the log files. I find that CVS and Subversion repository sizes are now roughly the same.
- Quite hard to share repositories
- No distributed development
- Pretty weak merging
The SVK project (basically distributed repositories built on top of subversion) is addressing a lot of these issues. Seems to be coming along nicely. The merge support isn't quite at the same level as arch yet, but the naming and command line syntax is a lot nicer IMHO.
Re:I've tried both Subversion and Arch by dozer · 2004-03-07 23:22 · Score: 3, Informative

Quite hard to share repositories
The repositories can be read using any WebDAV-compliant software.
Ever tried setting up a WebDAV server? That fits anybody's definition of hard. The Subversion team recognize this, so they allow you to access the repository over ssh too (thank goodness!). Problem is, everyone using ssh must log in to the same user account or the permissions get screwed up. So, yes, it's quite hard to share repositories in Subversion.
No distributed development
If you're talking about multiple servers like BitKeeper...
Um, yeah. OK, allow me to be slightly clearer: Subversion does not support decentralized development. Not at all. It's a major limitation.
Re:I've tried both Subversion and Arch by W2k · 2004-03-08 00:24 · Score: 2, Informative

Ever tried setting up a WebDAV server? That fits anybody's definition of hard.

I strongly disagree. Setting up a Subversion repository to be accessible over the 'net was PISS EASY, even for me, a first-time user. You can use the included light-weight server (svnserve) or Apache2 if you need options like complex authentication. It's very easy to set up and very nice to look at if you enable XML output. :)

There are howtos in the Subversion book. Happy reading.

--
Quality, performance, value; you get only two, and you don't always get to pick.
Re:I've tried both Subversion and Arch by Anonymous Coward · 2004-03-08 00:57 · Score: 3, Informative

Problem is, everyone using ssh must log in to the same user account or the permissions get screwed up. So, yes, it's quite hard to share repositories in Subversion.

I do believe that is wrong. When using ssh for access, the users need to be in the same group, and the repository directory needs to be sticky and writable by that group.

Once set up correctly, there are no problems with ssh access by multiple users.
Re:I've tried both Subversion and Arch by an_mo · 2004-03-08 04:00 · Score: 1

Problem is, everyone using ssh must log in to the same user account or the permissions get screwed up. So, yes, it's quite hard to share repositories in Subversion.

I think you are wrong. I log in to my repo from different ssh accounts without problems. Using cvs + svnserve with multiple accounts is also possible in Windows XP.
Re:I've tried both Subversion and Arch by nthomas · 2004-03-08 06:32 · Score: 1

Subversion Bad Points:
Database & log files take up a LOT of space.
svnadmin comes with a subcommand called list-unused-dblogs that you run on your repository; it tells you which Berkeley DB log files are unused, and you can then delete them. Usually people will just run:
svnadmin list-unused-dblogs repository | xargs rm
All of this is moot if you are running Berkeley DB 4.2 or greater -- it cleans unused log files automatically.
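
For anyone who would rather script the cleanup than pipe to xargs, here is a rough sketch of the same idea in Python. The function names, the `dry_run` flag, and the injectable `runner` are assumptions made up for this example; only the `svnadmin list-unused-dblogs` subcommand itself comes from the post above.

```python
import os
import subprocess


def list_unused_dblogs(repo_path, runner=subprocess.check_output):
    """Return the log-file paths printed by `svnadmin list-unused-dblogs`."""
    output = runner(["svnadmin", "list-unused-dblogs", repo_path])
    if isinstance(output, bytes):
        output = output.decode()
    # One path per line; skip blank lines.
    return [line.strip() for line in output.splitlines() if line.strip()]


def remove_unused_dblogs(repo_path, runner=subprocess.check_output, dry_run=True):
    """Delete the unused log files; with dry_run=True only report them."""
    paths = list_unused_dblogs(repo_path, runner)
    if not dry_run:
        for path in paths:
            os.remove(path)
    return paths
```

Defaulting to a dry run is a deliberate choice here: you get to look at the list before anything is removed.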
Quite hard to share repositories
Decentralized repositories are one feature Subversion does not have (yet). But take a look at SVK, which is what the Subversion developers currently recommend to anyone looking for this feature.
No way to mark your branches (if you accidentally check out the directory containing your branches, you just got 50 gigs of 99.9% identical files...)
Which is why the current best practice is to lay out your repository like this:
/
/trunk
/tags
/branches
This way, you put your main trunk in /trunk, and all your branches would go into /branches. Now when you go to check something out, check it out from the appropriate directory.
You did read the online Subversion O'Reilly book, didn't you?
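
To make the layout concrete, here is a small sketch that builds the URL you would hand to `svn checkout` for the trunk, a single branch, or a single tag, so that nobody grabs all of /branches by accident. The helper name and the repository URL are hypothetical, invented for illustration.

```python
def checkout_url(repo_root, branch=None, tag=None):
    """Build the URL for trunk, one branch, or one tag under the standard layout."""
    if branch and tag:
        raise ValueError("pick either a branch or a tag, not both")
    root = repo_root.rstrip("/")
    if branch:
        return f"{root}/branches/{branch}"
    if tag:
        return f"{root}/tags/{tag}"
    return f"{root}/trunk"


# Hypothetical repository URL, for illustration only.
print(checkout_url("http://svn.example.com/repos/myproject"))
print(checkout_url("http://svn.example.com/repos/myproject/", branch="bugfix-1"))
```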
No distributed development
I don't know what you mean by this, and how it differs from your "shared repositories" point above. Can you disambiguate?
Thomas
Re:I've tried both Subversion and Arch by FlashHamster · 2004-03-08 09:34 · Score: 1
To "Subversion Bad Points", add:
- Lacks the ability to preserve file time stamps on import
Not wanting to lose information on a project with 3 years of history (partly in time stamps), I am still stuck with CVS, too.
Re:I've tried both Subversion and Arch by empty · 2004-03-08 18:52 · Score: 1

Subversion does not support decentralized development.

Check out svk. That project uses svn as the base filesystem for a distributed version control system, a la bitkeeper. I really know nothing about it, but it may be (sort of) what you are looking for.

By the way, why is lack of support for decentralized development a "major limitation"? The only (open source) project I've heard of that needs it is the kernel. Most commercial software also appears to be centralized.
Re:I've tried both Subversion and Arch by StrawberryFrog · 2004-03-09 00:15 · Score: 1

Um, yeah. OK, allow me to be slightly clearer: Subversion does not support decentralized development.

Client-server source control systems in general were created to support decentralised development.

I'm hacking away at my copy of the source over here, you on yours over there, and the central archive on Sourceforge keeps us consistent. You, I and the other hackers are thus not constrained (centralised) in our development, as we would be without any source control at all.

Maybe the lack of decentralised version control is a limitation, but it's not currently a lack that most people feel.

--
My Karma: ran over your Dogma
StrawberryFrog
Re:I've tried both Subversion and Arch by Anonymous Coward · 2004-03-14 12:35 · Score: 0

"No way to mark your branches (if you accidentally check out the directory containing your branches, you just got 50 gigs of 99.9% identical files...)"
Well, no way to help you if you 'rm -Rf /' as root either. That's why you have the 'svn list/ls' command, which is certainly a good idea, as I had the problem you mentioned with CVS too when I checked out repositories for multi-projects.

Binary files by ggeens · 2004-03-07 21:26 · Score: 4, Informative

Do developers out there voice the need to store binaries?

There are definitely reasons for storing binary (non-text) files in a version control system:

Images: quite obvious. You want to version all your artwork. For web-based projects, this can be a large part of your system.
External libraries: if you use third-party libraries, it makes sense to store them in the version control system. If you need a particular build, you check out the correct revision. This allows you to build the exact same binary as it was delivered before. (Of course, if you have the sources to the library, you might want to import them into your project. But if you don't change the sources, that might be overkill.)
Compiled files: some people like to store all object files into version control. Again, this allows you to retrieve a specific version faster (no need to recompile). Personally, I would do this only if the compilation takes too much time.
Documentation: whether you use MS Office or OpenOffice.org, documentation will be in a binary format. (OOo uses compressed XML.)
Test data: you might want to version your test cases, and those will consist of binary data.

--
WWTTD?

Re:Binary files by marcovje · 2004-03-08 00:49 · Score: 1

Agree,

and when not using an open source compiler: binary project files.

how do you migrate? by DeadSea · 2004-03-07 23:03 · Score: 2, Interesting

I can't switch unless we can convert our repository from cvs. Are there tools for doing this?

Re:how do you migrate? by TwistedSquare · 2004-03-08 01:20 · Score: 2, Informative

I asked this last time subversion appeared on slashdot, you can go see my comment and its helpful reply
Re:how do you migrate? by mgm · 2004-03-08 01:44 · Score: 5, Informative

Yep, Subversion comes with a conversion script, cvs2svn, which is under very active development right now. It's not quite so wonderful at converting CVS repositories with complicated branches, so you'll want to double-check the conversion, but lots of people are reporting success converting huge multi-gig repositories over to Subversion.
Re:how do you migrate? by Moonbird · 2004-03-08 01:44 · Score: 5, Informative

Look here...

--
All extremists should be taken out and shot.

It helps just a little by r6144 · 2004-03-08 00:46 · Score: 4, Interesting

I have used Subversion in quite a few (small, mostly one-man) research projects during the last six months. Before then I used RCS/CVS. Subversion does make me somewhat more comfortable, and I have little to complain about it, which means I probably won't ever look back.

However, if there were no free software like Subversion, I'd rather make do with CVS than use non-free stuff, even if someone else paid for it. For example, CVS does not have atomic commits, so I use tags instead (ironic, since CVS does tagging quite slowly, but still acceptable for one-man projects). Other weak points of CVS can also be worked around. It isn't pretty, but it's not THAT painful either. Actually, before I discovered RCS, I just did version control manually by saving a tarball after each day's work, which is tedious but still sufferable.

Of course, for large projects, version control is much more important.

Graph? by aled · 2004-03-08 02:17 · Score: 2, Interesting

Is there any client front end for subversion that makes a graphical tree of versions, like wincvs or cervisia? It's a very useful feature and I would like to have something equivalent for subversion.

--

"I think this line is mostly filler"

Re:Graph? by danielrall · 2004-03-09 07:05 · Score: 1

Not yet. With a 1.0 release out, I figure something like this is likely to turn up soon.

--
Daniel Rall

Any GUI Clients? by tjmsquared · 2004-03-08 02:49 · Score: 2, Interesting

Are there any GUI clients like wincvs for subversion yet? It looks like a much better tool, but I don't see my group switching unless there is a client that is at least as good as wincvs.

Re:Any GUI Clients? by Anonymous Coward · 2004-03-08 04:49 · Score: 0

TortoiseCVS is a GUI, so I presume TortoiseSVN is too.
Re:Any GUI Clients? by Anonymous Coward · 2004-03-08 06:33 · Score: 0

I've been using TortoiseSVN. Works like a charm.

You want RapidSVN by Valdrax · 2004-03-08 07:48 · Score: 2, Informative

That's a pretty good question in my opinion, and TortoiseSVN's Windows shell-extension doesn't cut it. ("-1, Redundant" my ass.) If you're looking for something more like WinCVS, check out RapidSVN.

--
If it's for-profit but free, you're not the customer -- you're the product (e.g., the Slashdot Beta's "audience").

Atomic commits by metallidrone · 2004-03-08 08:22 · Score: 1

Actually, atomic commits means something totally different from global revision numbers. Having atomic commits means that a software failure during repository-modifying activities leaves everything in a well-defined state. That is, if you are committing your changes and the network connection dies, your computer dies, the OS or source control client crashes, you kill the source control client, etc., then your repository should not be corrupted (and it will be as though you never committed at all). With CVS, some files may have gotten committed to, and others not, leaving the repository in an unknown/inconsistent state.
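
A toy illustration of the all-or-nothing behaviour described above, for anyone who finds code easier to read than prose. This is an in-memory stand-in with made-up names (`ToyRepository`, `fail_after`), not how Subversion actually stores anything: changes are staged on a copy and published in one step, so a failure partway through leaves the repository as it was.

```python
class ToyRepository:
    """In-memory stand-in for a repository; purely illustrative."""

    def __init__(self):
        self.files = {}      # path -> contents at the latest revision
        self.revision = 0    # one revision number for the whole tree

    def commit(self, changes, fail_after=None):
        """Apply all of `changes` (path -> contents) or none of them.

        `fail_after` simulates a crash after that many files were staged.
        """
        staged = dict(self.files)  # work on a copy, never in place
        for count, (path, contents) in enumerate(changes.items()):
            if fail_after is not None and count >= fail_after:
                raise ConnectionError("simulated failure mid-commit")
            staged[path] = contents
        # Only reached if every file was staged: publish in one step.
        self.files = staged
        self.revision += 1
        return self.revision
```

If the second commit of two files dies after the first file, `files` and `revision` still show the previous state, which is the "as though you never committed at all" property.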

Re:Atomic commits by Anonymous Coward · 2004-03-08 08:37 · Score: 0

Well... they're two sides of the same coin...

Meta data and Moves by irontiki · 2004-03-08 09:04 · Score: 2, Interesting

I've been using CVS in professional development environments for about 5 years at several different employers. I love CVS but have been watching Subversion closely and with some anticipation.

The atomic commits will be nice but honestly the lack of them has never been a huge problem for my teams (atomic commits are probably less a problem with 6-8 people). The things that do bug me about CVS that Subversion is supposed to address :

1. the ability to move or rename a file w/o losing the history

2. the ability to set file permissions

3. ability to remove unused directories

I know that these things can be achieved by tweaking CVS's files manually but that's a long way from elegant. It's been a stumbling block when I'm trying to introduce a new team to CVS.

SourceForge? by mattgreen · 2004-03-08 14:35 · Score: 1

Someone want to forward this to the guys at SF? I'd like to know what became of their, "we'll add subversion once it matures enough" claim.

Re:SourceForge? by endx7 · 2004-03-08 17:59 · Score: 1

Someone want to forward this to the guys at SF? I'd like to know what became of their, "we'll add subversion once it matures enough" claim.

Sourceforge seems to have been having some serious growing pains. I'd hope they fix their current problems first before adding more things that could break.
Re:SourceForge? by Anonymous Coward · 2004-03-13 13:08 · Score: 0

Forward it yourself you lazy ass... ;-)

svn externals != cvs modules by Anonymous Coward · 2004-03-08 14:54 · Score: 0

I spent all weekend playing around with svn, actually, cvs2svn is still converting my 3 GB cvs repo...

There's two things holding me back currently, the long ( and possibly broken ( cvs2svn.py is not 1.0 ) ) conversion process, and the lack of decent support for the cvs modules file.

I think I _might_ be able to convince the rest of the developers that a clean switch might be ok, but there's no way around the heavy use of the modules file.

What we do is have the projects that you check out as modules in the modules file; each of those includes the common parts that are shared across the two platforms, as well as across projects. We also build some of our components as libraries, and we also include third-party libraries. When we're ready to ship something, we tag that module in CVS, which tags not only the source for that project, but all the common code, our libraries, and external libraries. This makes it very easy to share code across projects, yet retain an easy checkout/build/rebuild. I don't see how I can do this with Subversion and the externals file... =(

--patiently waiting on the svn:externals...
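
For readers who haven't seen one, here is a sketch of what an svn:externals definition looks like, generated from a list of shared components. It assumes the "local-dir [-rREV] URL" line format described in the Subversion book; the helper name and the URLs are hypothetical. As far as I understand it, an external without a pinned revision keeps following the newest revision even after you tag the parent, which is why it is not a drop-in replacement for tagging through the CVS modules file.

```python
def externals_property(entries):
    """Render svn:externals lines in the 'local-dir [-rREV] URL' form.

    `entries` is a list of (local_dir, url, revision) tuples; a revision
    of None leaves that external unpinned.
    """
    lines = []
    for local_dir, url, revision in entries:
        pin = f" -r{revision}" if revision is not None else ""
        lines.append(f"{local_dir}{pin} {url}")
    return "\n".join(lines) + "\n"


# Hypothetical components and URLs, for illustration only.
print(externals_property([
    ("common", "http://svn.example.com/repos/common/trunk", None),
    ("libs/thirdparty", "http://svn.example.com/repos/thirdparty/trunk", 120),
]))
```

Pinning every external with an explicit revision before tagging is one way to get a reproducible snapshot, at the cost of an extra step that the modules file did for you.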

Hot backups to plain text by danielrall · 2004-03-09 06:56 · Score: 1

Hot backups to plain text make the live data storage format largely irrelevant. See `svnadmin dump --incremental`, `svnadmin hotcopy` (and its wrapper script `hot-backup.py`) as documented in the open source "Version Control with Subversion" book (another fine O'Reilly tome written by some of the core developers).

http://svnbook.red-bean.com/html-chunk/ch05s03.html#svn-ch-5-sect-3.6
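
As a minimal sketch of how those two commands might be wired into a backup job, the helpers below only build the argument lists; running them and redirecting the dump output to a file is left to the caller. The function names and the idea of tracking a "last backed up" revision are assumptions of this example, and the youngest revision would have to come from somewhere else (for instance `svnlook youngest`).

```python
def incremental_dump_command(repo_path, last_backed_up, youngest):
    """Build the argv for dumping only revisions newer than the last backup.

    Returns None when there is nothing new to dump.
    """
    if youngest <= last_backed_up:
        return None
    first = last_backed_up + 1
    return ["svnadmin", "dump", repo_path,
            "-r", f"{first}:{youngest}", "--incremental"]


def hotcopy_command(repo_path, backup_path):
    """Build the argv for a full copy of a live repository."""
    return ["svnadmin", "hotcopy", repo_path, backup_path]


# Hypothetical paths and revision numbers, for illustration only.
print(incremental_dump_command("/var/svn/repo", last_backed_up=100, youngest=105))
print(hotcopy_command("/var/svn/repo", "/backup/repo-copy"))
```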

Sure, most of us have edited ,v files by hand at one time or another, but Subversion has built-in commands replacing almost every non-corruption use case for that insanity. The operational procedure for handling data corruption -- which has never happened to date -- is backups, not hacking at the raw data storage format and praying.

--
Daniel Rall

Yes ease right in by Anonymous Coward · 2004-03-09 14:45 · Score: 0

Yes you can ease right into subversion like Taco eases right into Timothy's backside.

No, No., No to efficiency by Anonymous Coward · 2004-03-11 05:20 · Score: 0

If you speed up CVS, I'll have no time to read /.

Re:Any GUI Clients? Tortoise by bstil · 2004-03-12 06:08 · Score: 1

Yes, if you've used TortoiseCVS before, you might want to check out TortoiseSVN...

It integrates into Windows Explorer and allows you to do all the updates, commits, etc with right mouse clicks.

Re:Is there demand? atomic commits by bstil · 2004-03-12 06:14 · Score: 1

Also, have there been many problems that required atomic commits? Can someone explain why this is important?

Well, to database developers, the thought of having SQL scripts committed WITHOUT atomic commits is very scary. I use CVS to record the SQL DDL scripts for database generation (and backup). If I committed a new table.sql script, for example, and that conflicted with a sequences.sql which was not committed atomically, my database keys could completely melt down...

Fortunately, we don't have enough developers with CVS for that to be a problem, but I plan to move us to Subversion soon.

build number, j2ee by bstil · 2004-03-12 06:16 · Score: 1

Yeah, I love the fact that there's a revision number that's global to the whole repository. We embed that number into each build of our product and our testers file bugs against a particular revision.

Has anyone done that for Subversion with some Java build tools like Ant or Anthill? Do you incorporate the build number into your WAR or EAR file?

Re:build number, j2ee by spongman · 2004-03-12 06:31 · Score: 1

I'm sure you could do it quite easily. We're using .NET and building with VS.NET, so we just embed the /.svn/entries XML file as a resource in the exe, and query the revision # at runtime using XPath. I'm sure you could extract the revision # at build time if you're using something more powerful like Ant.
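
A rough sketch of the build-time variant, in Python rather than Ant. It assumes the entries file looks roughly like the embedded sample: a `wc-entries` root in the `svn:` namespace, where the entry with an empty `name` describes the directory itself and carries a `revision` attribute. The sample content and the function name are made up for illustration, so check them against your own working copy before relying on this.

```python
import xml.etree.ElementTree as ET

SVN_NS = "{svn:}"  # assumed namespace of the entries file

# Made-up sample in the assumed format, for illustration only.
SAMPLE_ENTRIES = """<wc-entries xmlns="svn:">
<entry name="" kind="dir" revision="4168" committed-rev="4123"/>
<entry name="build.xml" kind="file"/>
</wc-entries>
"""


def working_copy_revision(entries_xml):
    """Return the revision recorded for the directory itself (the entry named '')."""
    root = ET.fromstring(entries_xml)
    for entry in root.iter(SVN_NS + "entry"):
        if entry.get("name") == "":
            return int(entry.get("revision"))
    raise ValueError("no directory entry found")


print(working_copy_revision(SAMPLE_ENTRIES))
```

A build script could write the returned number into a properties file or manifest that ends up inside the WAR or EAR.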

130 comments