Domain: dirvish.org
Stories and comments across the archive that link to dirvish.org.
Comments · 21
-
Re:Roll your own?
Dirvish does exactly this.
-
Re:This just gave me a good idea!
Yep, and then you don't have to worry about:
- Changes in permissions/mtimes/atimes corrupting all your old backups because all of them are hard linked, or alternatively
- Changes in permissions/mtimes/atimes causing an entire file to get copied
There are also other things to worry about. To be fair, the guy who invented --link-dest wrote a backup program called Dirvish so that is a better comparison to rdiff-backup.
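The first worry above comes from how hard links behave: every name for a file points at the same inode, so permissions and timestamps are shared between all snapshots that link to it. Here is a minimal Python sketch of that effect, assuming a POSIX filesystem; the file names and the function name are made up for illustration and have nothing to do with rdiff-backup or Dirvish internals.

```python
import os
import tempfile


def chmod_leaks_across_hard_links() -> bool:
    """Return True if a chmod through one name shows up through the other."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical names standing in for the same file in two snapshots.
        old = os.path.join(root, "snapshot_old.txt")
        new = os.path.join(root, "snapshot_new.txt")
        with open(old, "w") as fh:
            fh.write("data")
        os.chmod(old, 0o644)

        os.link(old, new)      # second name for the same inode
        os.chmod(new, 0o600)   # "today's" permission change

        same_inode = os.stat(old).st_ino == os.stat(new).st_ino
        old_mode = os.stat(old).st_mode & 0o777
        # The "old" snapshot now reports the new mode as well.
        return same_inode and old_mode == 0o600
```

A tool that wants to keep the old metadata has to store a fresh copy of the file, which is the second worry in the list.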
-
Don't use rsync - at least, not vanilla
Don't use rsync to make backups. Because you don't just want to backup against spontaneous combustion - inevitably, there will be accidental deletions and the like occurring in your studio. If you use rsync (with --delete, as any sane person would, otherwise your backup server will fill up in days, not years), then when some n00b runs `rm -rf ~/ReallyImportantVideos`, they'll be deleted from the backup too.
Remember that pro photography website that went down, because their "backup" was a mirroring RAID setup? Yep - they lost all their data in one fell swoop when somebody accidentally deleted the whole lot. Don't make the same mistake.
Use an incremental backup tool. Three that come to mind are rdiff-backup, Dirvish, and BackupPC.
I would think that rdiff-backup would suit your needs best. I currently use BackupPC at home, which is great for home backups, but I think that it's overkill (and possibly a bit limited) for what you want.
Hope this helps!
-
raid? lvm? backup? optical? answers here!
Some people already mentioned Raid - please remember that raid is primarily for high availability and it is not a reliable solution to backup. A raid will not save you from power supply blowing and taking out mobo+hard_drives, it will not help you when your raid-5 suffers a second failure while rebuilding with a fresh drive or hot-standby, nor will it help you when your house is involved in a destructive act of "mother nature".
As a side note, I would stay away from volume managers as well - just follow the KISS (keep it simple stupid) principle.
Here is my setup at home - not high availability, but cheap and reliable:
A Linux/FreeBSD file server setup with individual mountpoints, no raid, no volume manager. You get all the capacity purchased, a single modern SATA drive can basically fill a 1Gbit link (so speed is not limiting) and if one drive fails, you only lose that mountpoint (as opposed to one huge linear/append volume being affected).
A Linux/FreeBSD backup server (identical drives to the first machine - mine is a low end PII/400 with 384MB ram). Same simple mountpoints and a nightly cron job doing incremental rsync with multiple directories full of hard-links to non-changed files (lookup dirvish - http://www.dirvish.org/, I'm actually using my own rsync wrapper based on the same idea).
Structure your storage on the file server so that you have archive and newstuff directories. Anything that you create/generate goes into newstuff; once you have enough to fill a DVD or Blu-ray, burn it, deposit it into your security box at the bank and move it to 'archive' (never touch anything in archive - make it RO).
You now have a robust, cheap and simple file server, a backup server with 30-90 days of incremental backups in case you need to restore files, and a stack of optical media in the bank. In case of a disaster, just copy the optical media into replacement server's archive directory. Just remember to start a fresh optical media dump every ~5 years - they don't last much longer than that.
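The nightly hard-link rsync run on the backup server can be sketched roughly as follows. This is a minimal Python sketch, not dirvish's code and not the poster's own wrapper: it only builds an rsync command line using rsync's `-a` and `--link-dest` options. The vault path, the date-named snapshot directories and the function name `build_rsync_command` are assumptions made for illustration.

```python
from pathlib import Path
from typing import List, Optional


def build_rsync_command(source: str, vault: Path, today: str,
                        previous: Optional[str]) -> List[str]:
    """Build the rsync call for one nightly snapshot.

    Files unchanged since `previous` become hard links into that
    snapshot; changed or new files are transferred normally.
    """
    cmd = ["rsync", "-a"]
    if previous is not None:
        # Use an absolute path: a relative --link-dest is interpreted
        # relative to the destination directory.
        cmd.append(f"--link-dest={vault / previous}")
    # Trailing slash on the source copies its contents, not the directory.
    cmd += [source.rstrip("/") + "/", str(vault / today)]
    return cmd
```

A cron job would pass the returned list to `subprocess.run(cmd, check=True)`. Because each night writes into a fresh directory, nothing is ever deleted from older snapshots.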
I have a 1.75TB file server, a 1.75TB backup server, a security box full of DVD+Rs and the "system" is hands-off (other than burning a few DVDs every 1-2 months).
Good Luck!
M.
ps. make sure to put your machines on UPS and use XFS for best overall performance with large filesystems with tons of hard-links where rsync is stat-ing everything each night...
-
Dirvish
... or give a look at Dirvish. It uses rsync and keeps full snapshots using hardlinks for unchanged files. Works like a charm for me.
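To make "full snapshots using hardlinks for unchanged files" concrete, here is a minimal pure-Python sketch of the idea. It is not how Dirvish is implemented (Dirvish drives rsync); the function names and the size-plus-mtime comparison are assumptions chosen for illustration, and symlinks and special files are not handled specially.

```python
import os
import shutil
from typing import Optional


def _unchanged(a: str, b: str) -> bool:
    """Treat two files as identical if size and whole-second mtime match."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def snapshot(source: str, dest: str, previous: Optional[str] = None) -> None:
    """Create a full image of `source` in `dest`.

    Files unchanged since `previous` are hard-linked instead of copied,
    so they take no extra data space. Directories cannot be hard-linked,
    so the whole directory tree is recreated for every snapshot.
    """
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and _unchanged(src, old):
                os.link(old, dst)        # unchanged: share the old inode
            else:
                shutil.copy2(src, dst)   # new or changed: real copy
```

Every snapshot directory looks like a complete copy, yet only new or changed files cost additional data space.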
-
Rsync Backup System
If you are interested in the concept of using hardlinks to simplify your backup experience, you can either set something up manually with rsync, or you can check out Dirvish: http://www.dirvish.org/ .
As others have pointed out, the pieces that are missing are:
1) No directory hardlinks, meaning your backups take up more space than you would like, since you have to create the whole directory structure, and then hardlink each file.
2) No FSEvents, so a full tree compare must be done by rsync each backup. So backups are not instantaneous, and are a bit intensive.
3) No nifty interface
All that said, if you read the Ars Technica article about how the Time Machine backups are stored, they are stored in a manner almost identical to Dirvish.
For me Dirvish makes it trivial to have a backup taken each night, with various expiration rules, and each backup being essentially an incremental backup.
-
Re:Innovation
Or, you know, maybe it was just Time Machine that is ripping off Dirvish, which I've been using to do backups on my fileserver for years.
-
I'm too lazy to do any research...
...but how is this different from Dirvish, which has been around for years?
-
Smells like dirvish
This sounds like http://www.dirvish.org/, which is nearly as nice as the automatic file snapshots done by the "Network Appliance" fileserver boxes I've used at the last 2 out of 3 workplaces.
-
dirvish
Dirvish is written in Perl and uses rsync; it is a fast disk-to-disk backup. Enjoy.
-
Elegant, reliable & cheap (free) solution
Faubackup. Or perhaps dirvish. Either one works on Linux, and both are pretty easy to use if you can write simple bash shell scripts. In the case of faubackup (http://faubackup.sourceforge.net/), the backups are made to disk and can be run automatically with crontab. If you combine faubackup with rsync, you can even make automatic backups to other hosts over the Internet. Dirvish also makes backups to disk, but doesn't require rsync for the remote stuff (http://www.dirvish.org/).
However, if you're hoping to find something elegant, reliable & cheap (free) for Windows, I don't think that exists. The Windows world is awash with expensive commercial backup solutions, almost always involving expensive hardware (tapes, yuk). The best way to backup Windows is... by using Linux. If there are any free Windows solutions, I doubt that they can hold a candle to the two mentioned above.
-
Re:make one
Actually it is not really necessary to create a new one, because there already is a pretty good solution for incremental harddrive backup: Dirvish: http://www.dirvish.org/.
It allows you to create configurations and have regular incremental backups with easy recovery of arbitrary previous states.
It uses pre-known techniques such as rsync and hard links.
-
Version tracking? Noting redundant files?
If you tracked deltas within files, you could look to xdelta as a filesystem, or possibly CVS.
If you were just tracking changed files, you could look to Plan 9 filesystem or Dirvish.
What might be up: Picture backing up a number of fairly similar machines (say, a group of Windows machines built from a common image), & noting duplicated files, only saving each once. You could count the space saved by a link as compression. If you have a homogeneous sample, you save lots of space & claim ridiculous compression.
-
Looks like Dirvish, but less features.
You might want to take a look at Dirvish ( http://www.dirvish.org/ ).
I use it at work all the time. Dirvish can handle multiple backups using hardlinks, thus reducing the required space while keeping full images.
From their site:
Dirvish is a fast, disk based, rotating network backup system.
With dirvish you can maintain a set of complete images of your filesystems with unattended creation and expiration. A dirvish backup vault is like a time machine for your data.
-
Re:Neat.
I prefer Dirvish, and I highly recommend that people looking for a good harddisk-based backup system take a look at it. I've looked long and hard for a good backup system and this is the first that seems to fit the bill for me.
-
Use hard links, rsync, big redundant disk array.
I keep a lot more than 50 days worth on line. And I get effectively more than 90% compression. And individual users can do their own restores from their own desktops.
Look at how dirvish works. Or rsnap, or rsync-incr, or rsnapshot, or ribs-backup, or indeed any tool based on Mike Rubel's basic idea.
I use a homebrew variation that is suited to my employer's unique needs and infrastructure. You may find it expedient to do the same. I don't save any metadata other than the snapshot date for each tree, and I use data mining techniques (well, actually I use find and gawk from command line) if I want to determine what's going on or how the system is doing.
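The "effective compression" of a hard-linked snapshot tree can be estimated by comparing the apparent size of all snapshots with the bytes actually stored, counting each inode only once. The poster does this with find and gawk; what follows is a Python substitute, sketched under the assumption that all snapshots live under one vault directory. The function names are invented for illustration.

```python
import os
from typing import Tuple


def apparent_and_unique_bytes(vault: str) -> Tuple[int, int]:
    """Return (bytes as seen by walking every snapshot, bytes stored once)."""
    apparent = 0
    unique = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(vault):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)   # identifies the underlying file
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique


def savings(apparent: int, unique: int) -> float:
    """Fraction of apparent size saved by hard links (0.9 means 90%)."""
    return 0.0 if apparent == 0 else 1.0 - unique / apparent
```

Ten snapshots of a file that never changes report ten times its size as apparent bytes but store it once, which gives a saving of 90%.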
It has run for years with no maintenance other than periodic OS patches. It is not our primary backup system because it does not support off-site archival, but it's well worth the investment for rapid restore of user-deleted files. I'll consider this array (I'm currently using linux soft raid 1+0 on two physically separate busses) when I need more disk eventually.
-
Linux isn't restricted to binary-only RAID manager
You flamed the other guy for being "not particularly informed" and then you post "I don't want to be hold hostage to some binary-only shoddy RAID managment software running on Linux"?
I've been running completely open-source soft RAID for years on Red Hat linux. My backup server, which uses the same basic idea as dirvish, uses a couple of terabytes of RAID10. There are even multiple RAID implementations freely available, although you are typically restricted by your choice of kernels.
You zealots never seem to realize your conception of the system you disdain is necessarily going to be incorrect, because you aren't going to spend the time required to really understand it. Concentrate on cheerleading your chosen religion's good points and stop trying to point out the other guy's bad points, that way you can show some real insight.
-
Re:full article mirror & comment
http://www.mikerubel.org/computers/rsync_snapshots/
I've implemented a system like this where I work and it's quite nice. However, a nicer option exists in Dirvish which does all the rotating for you.
-
Rsync and Dirvish for disk-to-disk backup
I host dirvish ( http://www.dirvish.org/ ), a backup application for Linux/Unix, using Rsync and Perl. Like Chuck Messenger, I rotate the target drives. You can only trust an air-gap between your backed-up data and a hostile world.
Rsync ( http://rsync.samba.org/ ) is really great for backup of Unix-like systems. The ability to hardlink identical files allows me to store hundreds of daily full images of 100GB of sources to a single target 250GB hard disk. Rsync is very smart about moving only changed data over the network, resulting in speedups of 10x to 100x. This allows me to do full backup on my offsite colo without using a lot of bandwidth. Note that Rsync is great for Mac/Unix/Linux, but it does sometimes have problems with windoze clients. But then, so do I
...Dirvish (originally written by jw schultz) is a Perl wrapper around Rsync. It facilitates the scheduling and management of Rsync based backups. We have a fairly active mailing list and contributions from around the world (open source is so cool!).
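The claim of hundreds of daily full images of 100GB on a 250GB disk works out if little changes per day. As a rough back-of-the-envelope sketch in Python, assume the first image costs the full source size, every later image costs only that day's changed files, and directory and inode overhead is ignored. The function name and the figure of 301 images are illustrative assumptions, not numbers from the post.

```python
def max_daily_churn_gb(disk_gb: float, source_gb: float, images: int) -> float:
    """Largest average daily change (GB) that still fits `images` snapshots."""
    if images < 2:
        raise ValueError("need at least two images to talk about churn")
    # The first image uses source_gb; the rest share the remaining space.
    return (disk_gb - source_gb) / (images - 1)
```

With a 250GB disk, 100GB of sources and 301 images, the budget is 150GB over 300 days, or 0.5GB of changed files per day, about 0.5% of the source.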
Backups should be safe against:
- Failed hard drives
- Stupid mistakes
- Enemy action
- Fire, flood, and theft
- Host and power supply failure
- Unauthorized access
Backups should be automatic (or they will not get done) and cheap (hard disks are cheaper than tape, and much cheaper when you use hard linking). Rsync stores the data in a file system closely approximating the original, which facilitates restores.
If a cheap electrolytic filter capacitor dries out in your power supply, and the 5V output decides to start making a 15V squarewave instead, everything in your computer case will get fried. Including every one of the RAID disks. External USB enclosures (or airgaps!) protect against host and power supply failure.
If I was really paranoid about protecting my data, I would run a long ethernet cable to a nerdly neighbor a few houses away, and put a second dirvish server there. While I do rotate my drives into ziplok bags in a fire-resistant safe, the maximum credible accident (a furnace explosion) would tear open the firesafe. If I was paranoid and rich, I would use a high bandwidth VPN connection to a big disk in a colo machine in a different city.
The best backup is server-pull, frequent, automated backup onto multiple R/W media in multiple places, and frequent checking of that data. The closer you can approximate this, the more secure your data will be.
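One way to do the "frequent checking of that data" is to record a checksum for every file when a backup is taken and re-verify it later. The following is a minimal Python sketch of that idea, not a feature of Dirvish or rsync; the function names and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import os
from typing import Dict, List


def file_sha256(path: str) -> str:
    """Hash a file in 1 MiB chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: str) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_sha256(full)
    return manifest


def verify(root: str, manifest: Dict[str, str]) -> List[str]:
    """Return the sorted relative paths that are missing or have changed."""
    bad = []
    for rel, expected in sorted(manifest.items()):
        full = os.path.join(root, rel)
        if not os.path.isfile(full) or file_sha256(full) != expected:
            bad.append(rel)
    return bad
```

The manifest is only useful if it is stored somewhere other than the disk being checked; an empty list from `verify` means every recorded file is still present and unchanged.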
Keith
-
Re:Automatic Backup for Paranoids?
Use rsync and hardlinked snapshots. There are lots of examples out there. I rolled my own a while back, but if you want something relatively nicely polished and based on that idea, check out dirvish (I didn't find that until after I already had my system set up).
I really like having several months worth of nightly snapshots, all conveniently accessible just like any other filesystem, and just taking up slightly more than the space of the changed files.
-
Re:backing up will still take 50 disks
Personally, I have an old computer with a couple of extra disks, running dirvish. It's cheap, and has enough capacity to backup my homedir and some other stuff as well, and dirvish offers snapshot like capability (uses hardlinks to save space).