Home Directory In CVS
shamir_k writes "Joey Hess has come up with an innovative solution to a problem we have all faced. He's put his whole home directory in CVS. Not only can he move between multiple computers easily, he also has automatic distributed backups."
I wonder why CVS, and not something more advanced, like Arch or Subversion? Especially since he outright complains about common limitations in CVS, like moving files and dealing with directories at all. If he's hoping, as he says, "for a better replacement some day", why not see what the present has to offer?
I mean, that's not to say that alternative systems are perfect, either. I'm going through the process of learning arch now. There's a learning curve, but not nearly as big as it's made out to be. Still, using something else (almost anything else) would probably help on things like the merging issues, especially since he mentions that sometimes it's a pain keeping things in sync between three of his machines.
You wouldn't do this in BitKeeper if you were a privacy freak. Remember, the BitKeeper license requires that you maintain open logging. Realistically, this means that information about which files you change gets transmitted across the network; I don't know whether it's encrypted. It's not that big of a deal, but I'm sure someone here would care.
That being said, doing it in BK would be a compelling alternative if you wanted to use the same /home repo across (say) three or more machines, since you could take advantage of its more complex merge operators. Arch or SVN also might be good ideas. (Don't have any experience with Subversion, though.)
Joey shows you how to keep track of everything with CVS.
.zshrc or .procmailrc, I could roll back to the previous day's or look back and see when I made the change and why. It's very handy to be able to run cvs diff on your kernel config file and see how make xconfig changed it. It's great to be able to recover files you deleted or delete files because they're not relevant and still know you've not really lost them. For those amateur historians among us, it's very cool to be able to check out one's system as it looked one full year ago and poke around and discover how everything has evolved over time.
I keep my life in a CVS repository. For the past two years, every file I've created and worked on, every e-mail I've sent or received and every config file I've tweaked have all been checked into my CVS archive. When I tell people about this, they invariably respond, ``You're crazy!''
After all, CVS is meant for managing discrete bodies of code, such as free software programs that are worked on and available to a lot of people or in-house projects that are collaboratively developed by several employees. CVS has a reputation of being a pain to deal with, and it has a lot of crufty bits that regularly drive users up the wall, like its mistreatment of directories. Why inflict the pain of CVS on yourself if you don't have to? Why do it on such a scale that it affects nearly everything you do with your computer?
I get three major benefits from keeping my whole home directory in CVS: home directory replication, history and distributed backups. The first of these is what originally drove me to CVS for my whole home directory. At the time, I had a home desktop machine, two laptops and a desktop machine at work. Rounding this out were perhaps 20 remote accounts on various systems around the world and many systems around the workplace that I might randomly find myself logging in to. I used all of these accounts for working on the same projects and already was using CVS for those projects.
I'm a conservative guy when it comes to my computing environment (I've used the same wallpaper image for the past five years), and at the same time I'm always making a lot of little tweaks to improve things. Whenever I went to work and something wasn't just as I had tweaked it the night before, I'd feel a jarring disconnect and, annoyed, copy over whatever the change was. When I sat down at some other system at work, to burn a CD perhaps, and found a bare Bash shell instead of the heavily customized environment I've built up over the past ten years, it was even worse. The plethora of environments, each imperfectly customized to my needs by varying degrees, was really getting on my nerves. So one day I cracked and sat down and began to feed my whole home directory into CVS.
It worked astonishingly well. After a few weeks of tweaking and importing I had everything working and began developing some new habits. Every morning (er, afternoon) when I came into work, I'd cvs up while I read the morning mail. In the evening, I'd cvs commit and then update my laptop for the trip home. When I got home, I'd sync up again, dive right back into whatever I'd been doing at work and keep on rolling until late at night--when I committed, went to bed and began the cycle all over again. As for the systems I used less frequently, like the CD burner machine, I'd just update when I got annoyed at them for being a trifle out of date.
It only took a few more weeks before the advantage of having a history of everything I'd done began to show up. It wasn't a real surprise because having a history of past versions of a project is one of the reasons to use CVS in the first place, but it's very cool to have it suddenly apply to every file you own. When I broke my
The final major benefit took some time to become clear. Linus Torvalds once said, ``Only wimps use tape backup: real men just upload their important stuff on FTP and let the rest of the world mirror it.'' I'm not a real enough
I have no sig, the eyebrows seal the deal. That's right. Eyebrows.
I'll get modded down to oblivion for mentioning an MS product in a positive light, but Windows XP+2003 Server supports this already.
Users can roll back to previous revisions of files that they've saved to the 2k3 server, saving the sysadmins the time of restoring *another* accidentally deleted file from the backup tapes.
To make this clearer: a text editor, for example, does not keep the file open the entire time while waiting for input. It opens the file, reads it, then closes it at startup. When the user hits save (through keyboard commands, mouse click, whatever), the editor opens the file again, this time in write mode, writes the data, and closes it. By this model, close could be the commit function, and open could be the checkout function.
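A minimal sketch of the open-as-checkout, close-as-commit idea, in user-space Python rather than inside a filesystem. Everything here is an assumption made for illustration: the `versioned_open` helper, the `.history` directory and the numbered version names are hypothetical and do not describe how Windows Server 2003 or any other product stores revisions.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


def history_dir(path):
    # Hypothetical layout: versions live in a ".history" directory next to the file.
    return path.parent / ".history"


def list_versions(path):
    """Return the saved versions of `path`, oldest first."""
    prefix = path.name + "."
    hist = history_dir(path)
    if not hist.is_dir():
        return []
    found = [p for p in hist.iterdir()
             if p.name.startswith(prefix) and p.name[len(prefix):].isdigit()]
    return sorted(found, key=lambda p: int(p.name[len(prefix):]))


@contextmanager
def versioned_open(path, mode="r"):
    """open() acts as the checkout; close() after a write acts as the commit."""
    path = Path(path)
    fh = open(path, mode)
    try:
        yield fh
    finally:
        fh.close()
        if any(flag in mode for flag in "wa+x"):
            # Commit: snapshot what is now on disk as the next numbered version.
            hist = history_dir(path)
            hist.mkdir(exist_ok=True)
            number = len(list_versions(path)) + 1
            shutil.copy2(path, hist / f"{path.name}.{number}")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "notes.txt"
        for text in ("first draft\n", "second draft\n"):
            with versioned_open(target, "w") as fh:  # the editor's "save"
                fh.write(text)
        with versioned_open(target) as fh:  # the editor's startup read
            current = fh.read()
        return current, [p.read_text() for p in list_versions(target)]
```

Calling `demo()` saves twice and reads once, so two versions end up in the history and the read leaves it untouched. Note that this sketch commits on every close of a writable handle, which is exactly the behaviour that fills the disk with versions if nothing ever prunes them.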
I still have more fans than freaks. WTF is wrong with you people?
Ever hear of VMS? It had a filestore with file versioning - about 30 years ago.
Not just VMS. Apollo DOMAIN had something like this, too. -jh
It's been quite a while since I used VMS... IIRC the problems we had with the VMS versioning were:
- It created a new version every time you saved, so just going through a few change/compile/fix cycles (for example) would create lots of versions clogging up the disk.
- The old versions were in the same place as the latest version, and if you wanted to delete a file, you'd just say "delete blah.blah.*" to wipe out all versions (and therefore all traces of the file)... then say "oops!"
It was useful in many cases, but in a different way from CVS. A very useful solution would be to have file-level journaling with the ability to throw in comments and create tags and branches.
Personally I prefer Aegis. It's a bit more complicated to use than CVS (well, Aegis is MUCH easier to install than Subversion), but it takes care of way more situations for you.
Less is more !
I have a script that does all of this for me:
http://bleu.west.spy.net/~dustin/soft/filemonitor
You point it at a dir and run it from cron nightly. It also gives you a handy nightly mail telling you what changed. Excellent for those late night changes to systems where you don't remember what you did...or if someone else made some late night changes that you'd like to undo.
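For readers who want to see the general shape of such a monitor, here is a rough sketch. It is not the linked filemonitor script, whose internals are not shown here; the function names, the SHA-256 comparison and the A/D/M report format are all assumptions chosen for illustration.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def snapshot(root):
    """Map each file under `root` (by relative path) to a SHA-256 digest of its contents."""
    root = Path(root)
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            digest = hashlib.sha256(full.read_bytes()).hexdigest()
            result[full.relative_to(root).as_posix()] = digest
    return result


def compare(old, new):
    """Return sorted lists of added, removed and changed paths between two snapshots."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed


def report(added, removed, changed):
    """Format the summary as plain text, suitable for a mail body."""
    lines = ([f"A {p}" for p in added]
             + [f"D {p}" for p in removed]
             + [f"M {p}" for p in changed])
    return "\n".join(lines) if lines else "no changes"


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "hosts").write_text("127.0.0.1 localhost\n")
        (root / "motd").write_text("hello\n")
        before = snapshot(root)
        # Simulate some late-night changes.
        (root / "hosts").write_text("127.0.0.1 localhost box\n")
        (root / "motd").unlink()
        (root / "fstab").write_text("# empty\n")
        return report(*compare(before, snapshot(root)))
```

A nightly job along these lines would also have to store the previous snapshot between runs (as JSON, say) and hand the report to a mailer; both are left out to keep the sketch short. Undoing a change additionally requires keeping the old file contents, which a digest-only snapshot does not do.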
-- The world is watching America, and America is watching TV.
This isn't insightful. He specifically states in the article that he has a .hide directory that _doesn't_ get "sown like seeds across a number of systems" just for this very reason.
Bah.
I'm not sure why something that I wrote in 2001 and that appeared in print media in 2002 is news.
This is the second time I've been slashdotted for something over 1 year old this year. Previously it was the pkg-comp page, which I wrote circa 1998.
Kinda makes you wonder.
Anyway..
I suppose I should mention that these days I keep most of my home directory in subversion. I have not gotten around to writing a successor to this article yet, but it works even better than cvs, and that's probably the most common question people ask me about this article these days.
see shy jo
iFolder, for those that don't know, is Novell's distributed folder. Work done on any computer is synchronized with a server and automatically distributed and backed up to all other clients authenticated as the same user and running the iFolder client. A simple concept that proves decidedly valuable.
I attended a conference, today actually, about Novell's jump into Linux, and iFolder was stressed again and again as an excellent cross-platform synergy device. Throughout the conference I kept thinking, couldn't you just do this with CVS? But then I realized iFolder's true advantage.
iFolder lets you authenticate against a NetWare tree, gives you access with far fewer hoops to jump through, and provides easier administration (through iManager or ConsoleOne).
Just something I thought you should check out if CVS doesn't quite fill your needs.
Wrong. BitKeeper does not require open logging for single-user, single-host repositories.
Or you can just use Gentoo, which does this automatically when you update your system, pointing out the location and files that are different, all with diff output, and the ability to merge the changes, overwrite or ignore the change.
Open Source Java Web Forum with LDAP authentication
This was featured in a Linux Journal article from September 2002:
http://linuxjournal.com/article.php?sid=5976
Same guy, too.
CVS automatically senses binary files these days. No need for the explicit -kb.
I used to depend on exactly that technique to keep my home directories in sync. It was a pain in the butt. If you have a file named "foo" in replica A and no such file in replica B, then rsync has no way to tell whether it should delete "foo" from A or copy it to B. For that you need to know some minimal information about what the replicas were like at the time of the last synchronization. Unison takes care of all this for you in a very clever way. It's really worlds better than just using rsync. I highly recommend it.
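The "foo" ambiguity can be made concrete with a small sketch. This is not Unison's implementation; it only shows why remembering which names existed at the last synchronization is enough to tell a deletion from a creation. The `reconcile` and `plan` functions are hypothetical, and the sketch looks only at whether a file is present, not at whether its contents changed.

```python
def reconcile(in_a, in_b, in_last_sync):
    """Decide what to do with one name, given its presence in A, in B and at the last sync."""
    if in_a == in_b:
        return "nothing"
    present, absent = ("A", "B") if in_a else ("B", "A")
    if in_last_sync:
        # The file existed at the last sync, so its absence on one side is a deletion.
        return f"delete from {present}"
    # The file did not exist at the last sync, so it was created on one side.
    return f"copy to {absent}"


def plan(replica_a, replica_b, last_sync):
    """Build an action for every name seen in either replica (arguments are sets of names)."""
    names = sorted(replica_a | replica_b)
    return {n: reconcile(n in replica_a, n in replica_b, n in last_sync) for n in names}


# "foo" exists only in A. Without the last-sync record both readings are possible:
created_in_a = plan({"foo", "bar"}, {"bar"}, last_sync={"bar"})
deleted_in_b = plan({"foo", "bar"}, {"bar"}, last_sync={"foo", "bar"})
```

With `last_sync={"bar"}` the plan copies "foo" to B, and with `last_sync={"foo", "bar"}` it deletes "foo" from A. The two replicas look identical in both cases, which is the information a stateless one-way copy does not have.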
--Bruce Fields
Where do you think Gentoo got this from?
Debian.
I've been using Backup4l and am quite happy with it.
It's a multilevel incremental backup tool that every night mounts an extra HD that I use for backups and puts only the changed files into tar.gz files. It also deletes old files on the fly, and I currently store about 6 months of history.
Restoring files is very easy ('backup4l --restore */pr0n/*.jpg', optionally with a specific date), no hassles with manual commits, binary files and removed, moved, renamed files. Sure, it's not version control per se, but works fine for me.