NetBSD's Real-Time Network Backup
jschauma writes "One of NetBSD's developers, der Mouse, was interviewed by DaemonNews about his real-time network backup system (originally presented at BSDCan 2005), where changes to your local filesystem are automatically propagated to a backup server. In his interview der Mouse tells about his idea, how it works, and of course, how cool it is."
But hasn't Sun been doing this with Solaris for at least 3 years?
Any fool can criticise, condemn, and complain, and most fools do. - Benjamin Franklin
So we could have backup servers all over the world keeping track of disk write commands...
This is indeed very neat, but isn't it sorta how transactional databases have been working?
I also don't see how this solution is effectively any better than RAID... If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually). I'd think the backup server would need to be maintained as well... and if your backup ever fails, it seems like it would require a lot to set up another.
There also seem to be a lot of limitations as far as network security, filesystems, encrypted files, etc go. Furthermore, I don't see how the bandwidth hit is worth it (though I guess that depends on where your priorities are).
Admittedly, I'm no expert on this topic... so am I totally missing something?
Capitalism: When it uses the carrot, it's called democracy. When it uses the stick, it's called fascism.
Here is the idea behind the setup I am currently using: Easy Automated Snapshot-Style Backups with Linux and Rsync.
Volume shadow storage is exactly this kind of incremental, real-time backup process. How does this differ technically from that? (Other than the fact that you can now dynamically back up your morning toast, which is useful if a slice goes up in flames...)
This idea is really cool, but implementing it by putting hooks into each device driver seems overly complicated. It also doesn't sound like there's any sort of priority setting for this or any type of data filtering.
Personally I'd like to see something like the MS filesystem in development that allows SQL calls to be run against it (not sure if there are any other filesystems that are similar). Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data while the network isn't being used.
That would achieve the same thing, but more flexibly and without affecting normal use.
You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
Sounds like it's essentially a DoubleTake daemon for BSD, cool, I wonder how well it scales? Say if you wanted to fully mesh 10 or more servers or something. Sounds like it might come in handy for keeping the content in web farms in synch as well....
Obviously, RAID servers don't help you in the case of accidental deletion. And they certainly don't help if your whole computer gets blown up.
Still, you'd want to be careful with this, it would suck to back up all the temp files generated by random processes.
autopr0n is like, down and stuff.
This concept has been in play for years as a commercial product for Disaster Recovery, Veritas Volume Replicator (VVR).
Those crazy Germans.
quidquid latine dictum sit altum videtur.
How does this compare with Linux Network Block Device? Sounds very similar.
There are pretty mature commercial tools for this stuff, as well - Veritas' VVR replication comes to mind.
Shouldn't this technically be called a point in time recovery solution? When I think of a backup solution, I expect to be able to retrieve arbitrary files from an arbitrary point in time. Also, rather than mucking with the kernel, wouldn't it have been simpler to use the geom system?
Isn't this guy reinventing the wheel? Why not just run a RAID 1 setup using iSCSI? Wouldn't that accomplish the same thing a lot easier?
...how do you get ALL the data on the backup server to start with? Pushing the writes off to the backup server in real-time is identical to what the HP VA7410 SAN I work with does internally in RAID 1+0 except that this happens over the network. But how are the disks in the backup server ever going to get all the original filesystem data if that data already exists AFTER you build your backup server? Even if you have a log of writes, you can't reconstruct the data. You'll only be able to reconstruct recent changes.
-"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
I've been doing this with a web-based system. Not as direct, but it works automatically when you connect to the site. Platform independent that way.
At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
How is this any different from DRBD (http://www.drbd.org./)?
From the website:
DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1.
Each device (DRBD provides more than one of these devices) has a state, which can be 'primary' or 'secondary'. On the node with the primary device the application is supposed to run and to access the device (/dev/drbdX; used to be /dev/nbX). Every write is sent to the local 'lower level block device' and to the node with the device in 'secondary' state. The secondary device simply writes the data to its lower level block device. Reads are always carried out locally.
If the primary node fails, heartbeat is switching the secondary device into primary state and starts the application there. (If you are using it with a non-journaling FS this involves running fsck)
If the failed node comes up again, it is a new secondary node and has to synchronise its content to the primary. This, of course, will happen without interruption of service in the background.
And, of course, we only will resynchronize those parts of the device that actually have been changed. DRBD has always done intelligent resynchronization when possible. Starting with the DRBD-0.7 series, you can define an "active set" of a certain size. This makes it possible to have a total resync time of 1--3 min, regardless of device size (currently up to 4TB), even after a hard crash of an active node.
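For anyone wondering what "only resynchronize those parts that actually have been changed" amounts to, here is a toy sketch. It is not DRBD's code or data layout; the class name `MirroredDevice`, the tiny block size, and the plain set of dirty block numbers are all made up for illustration. The idea is just that the primary remembers which blocks it wrote while the peer was unreachable and copies only those on resync.

```python
BLOCK_SIZE = 4  # deliberately tiny; a hypothetical value for the demo


class MirroredDevice:
    """Toy primary device that mirrors block writes to a peer (hypothetical)."""

    def __init__(self, num_blocks):
        self.local = [bytes(BLOCK_SIZE)] * num_blocks
        self.remote = [bytes(BLOCK_SIZE)] * num_blocks  # stands in for the peer's disk
        self.peer_up = True
        self.dirty = set()  # block numbers written while the peer was unreachable

    def write(self, block_no, data):
        # Every write goes to the local lower-level device.
        self.local[block_no] = data
        if self.peer_up:
            self.remote[block_no] = data  # mirrored right away
        else:
            self.dirty.add(block_no)      # remembered for a later resync

    def read(self, block_no):
        # Reads are always served locally.
        return self.local[block_no]

    def resync(self):
        """Copy only the dirty blocks to the peer; return how many were copied."""
        self.peer_up = True
        copied = 0
        for block_no in sorted(self.dirty):
            self.remote[block_no] = self.local[block_no]
            copied += 1
        self.dirty.clear()
        return copied
```

Resync cost here depends on how many distinct blocks were touched during the outage, not on the size of the device, which is the property the quoted text is describing.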
In every case I've actually needed backups to date, I find that, if I did them instantly instead of nightly, I'd end up losing data. The most common need for a backup for me comes when I've made a mistake with the main data, and I need to go back to what I had, say, yesterday.
This isn't to say that instant backups wouldn't be nice for failover architectures, though. I just don't deal with systems that large, yet.
How does this protect against an rm -rf against the filesystem... I guess it would trash the backup on the other side.
Every time the server is started, it sends a command to all the clients causing a full sync of all changes that occurred while the server was offline. The same thing happens when a client is restarted: it sends a full sync to the backup server, and any blocks that do not match the client checksum are re-sent.
Thus the first time you ran this thing it would copy the whole disk image to the backup server. After that subsequent writes would be the only output.
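A minimal sketch of that checksum comparison, as I understand the description above. This is not the actual NetBSD code; the function names, the 4096-byte block size, and the choice of SHA-256 are my assumptions. An empty server (no checksums yet) naturally results in every block being sent, which matches the "first run copies the whole disk image" behaviour.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, not taken from the article


def block_checksums(image, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per block of the disk image."""
    return [hashlib.sha256(image[i:i + block_size]).hexdigest()
            for i in range(0, len(image), block_size)]


def blocks_to_resend(client_image, server_sums, block_size=BLOCK_SIZE):
    """Return {block_no: data} for blocks the server lacks or has wrong."""
    resend = {}
    for n, digest in enumerate(block_checksums(client_image, block_size)):
        # A block is re-sent if the server has no checksum for it
        # or its checksum differs from the client's.
        if n >= len(server_sums) or server_sums[n] != digest:
            resend[n] = client_image[n * block_size:(n + 1) * block_size]
    return resend
```

After the initial copy, only blocks whose digests differ cross the wire, though the client still has to read and hash the whole device to find them.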
This is basically RAID over the network. Personally, I can't see a lot of use for it... Just put the second drive in the machine, and use software RAID, rather than putting the second drive in a network server. Less network slowdown and congestion that way, not to mention CPU-time wasted packetizing, encrypting, etc.
As always, RAID (and now this) is not a backup solution.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
Larry Robertson came up with this concept in the mid 80s as I recall, implemented it for VMS way back then as a remote shadowing system. He told me about it in one of the Anaheim DECUS meetings back then, published it (his company being called Bear Software back then). While the idea was not patented, the idea of moving updates wide area and doing local journalling so that the "shadow" needed only to keep up with average write I/O rates, rather than peak write rates, was AFAIK new back then and he deserves credit for thinking of it and implementing it. If anyone should try to patent it, he could be contacted also to show prior invention and public description. (Another outfit that came out shortly later with something similar had, as I have been told, a copy of the Bear program in house. That suggests my belief is correct that Robertson came up with the idea first, and that duplicate invention did not occur there.)
Great for backing up the internet to your Iomega 80GB USB drive.
It's almost a troll to even mention it, since there are so many things pioneered by Multics....
Oh no! the rootkit got replicated to the backup server!
It seems to me that if you use a journalling filesystem that journals everything, not just meta-data, you can just send the journal logs off to your backup device. Presuming your backup device starts with the same baseline data (i.e. a full level-zero dump), you would then have the ability to restore your files, or entire filesystem, to the state it was in at any point in time just by playing back the journal logs. Presumably a "smart" replay algorithm could be implemented that would use some sort of regular snapshots and coalescing of the logs to speed up restores (reducing the time spent doing the journal replay).
Such an approach would not require hooks into each device driver either; it would be entirely at the filesystem level.
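To make the replay idea concrete, here is a rough sketch. The journal format is invented for the example (a timestamp, a byte offset, and the data written); real filesystem journals look nothing like this, and `JournalEntry`, `restore`, and `make_snapshot` are hypothetical names. `restore` plays the log forward from the baseline up to a chosen time, and `make_snapshot` folds old entries into a new baseline so later restores have less to replay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JournalEntry:
    timestamp: float  # when the write happened
    offset: int       # byte offset into the image
    data: bytes       # bytes written at that offset


def restore(baseline, journal, as_of):
    """Replay journal entries with timestamp <= as_of on top of baseline."""
    image = bytearray(baseline)
    for entry in sorted(journal, key=lambda e: e.timestamp):
        if entry.timestamp > as_of:
            break
        end = entry.offset + len(entry.data)
        if end > len(image):
            image.extend(bytes(end - len(image)))  # grow the image with zeros
        image[entry.offset:end] = entry.data
    return bytes(image)


def make_snapshot(baseline, journal, at):
    """Fold entries up to `at` into a new baseline; return it plus the rest."""
    new_baseline = restore(baseline, journal, at)
    remaining = [e for e in journal if e.timestamp > at]
    return new_baseline, remaining
```

Restoring from the snapshot plus the remaining entries gives the same result as replaying the whole log from the original baseline, as long as the target time is not earlier than the snapshot.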
His real name is Mike Parker, and he wasn't good enough for OpenBSD. Since then, he has a problem with Theo for not accepting his crap. So he uses Der Mouse (Theo's last name is pronounced "de-rat") as his stupid alias.
Yay for you. Now go back to masturbating over pictures of tux and stop posting trolls with links that are 5 years old.
It needs to do a full rescan on reboot?
UGH!
That kills it for me.
I just wanted to point out that there are several FUSE-based filesystem implementations that do the same thing (functionally, not implementation-wise) and they do not require hooks in the device drivers -- they don't even care what the filesystem is for the original or the backup.
And, yes, RAID is a very good solution if you've got the money and are smart enough to recognize when a disk fails...
This article makes me laugh...get a life and don't troll Slashdot, kthx.
Nerdspeak.net
So what you have is not only a real-time backup, but also the ability to unwind any possible damage after a break-in or other event. If your backup server is only running a journalling client, it can be made extremely secure, and also provide an excellent auditing tool.
It would also be possible to delay the streams and have arbitrarily old filesystems available. Or to use a local journal as a buffer to smooth out the IO load as it is piped off site. It could also be used to augment a non-journalling filesystem, for crash recovery purposes, assuming your filesystem provides at least some consistency guarantees. In fact Netapp does something similar by logging FS operations to NVRAM, while the filesystem only writes consistency points periodically.
Although I haven't read about the NetBSD work, I am sceptical that they could get the error handling to work correctly at that level. With the DragonFly journalling, there is support for transactional consistency, as well as recovering from interrupted network connections. While it is not complete, much of it is in place and functional.
See the mountctl manual page for attaching a journal to a mount, and the jscan manual page for processing the journal.
Good lord, am I the only one who thinks this isn't cool and is possibly the worst implementation of a network backup system?
Now it's possible I'm not seeing this the right way, but as I understand it, the entire backup process is embedded in the kernel. Modularity gets kinda difficult then, and sucks to be you if the backup process at the other end craps itself and pushes data into the IP link that causes it to crap itself. If a change needs to be made to the backup process, then you have to start messing around with the kernel. Not clever.
How about this for a theoretical backup process:
backupd asks the kernel to notify it when a directory or file changes.
Upon notification, backupd forks a process (or adds the file to a queue) and pushes the changed file down a pipe. Since we can chop and change the pipe at will, we can use ssh, or substitute our encryption/authentication methods at any time.
Hey guess what, a normal user can run this on a standard kernel, or root can run the same process to do a backup of an entire directory, and it can be installed on any filesystem. Gosh.
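A very rough sketch of that userland backupd, with one big caveat: instead of real kernel notification (kqueue, inotify, or similar) it polls file mtimes and sizes, purely so the example runs anywhere. The `Backupd` class, the `scan` helper, and the `send` callable are all hypothetical. `send` plays the role of the pipe, so it could be swapped for something that feeds ssh or any other transport.

```python
import os
from collections import deque


def scan(root):
    """Map each file under root to a (mtime_ns, size) signature."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


class Backupd:
    """Toy change-driven backup daemon (polling stands in for kernel events)."""

    def __init__(self, root, send):
        self.root = root
        self.send = send      # callable(path, data): the replaceable "pipe"
        self.seen = {}        # signatures from the previous poll
        self.queue = deque()  # changed files waiting to be pushed

    def poll(self):
        """Queue every file that is new or changed since the last poll."""
        current = scan(self.root)
        for path, signature in current.items():
            if self.seen.get(path) != signature:
                self.queue.append(path)
        self.seen = current

    def drain(self):
        """Push queued files down the pipe; return how many were sent."""
        sent = 0
        while self.queue:
            path = self.queue.popleft()
            try:
                with open(path, "rb") as fh:
                    self.send(path, fh.read())
            except FileNotFoundError:
                continue  # file vanished between poll and drain
            sent += 1
        return sent
```

Nothing here needs root or a patched kernel, which is the point being argued. Note that it does nothing about deletions and sends whole files rather than changed blocks.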
Wouldn't this also replicate deletes across to the offsite machine in near-real-time? So if one were to accidentally delete a file, or a $HOME directory, or a complete filesystem, then there would be no way to recover from this from the "backup" machines, because their files would have gotten nuked too?
Hey, Windows users, there is no such thing as "forward" slash, there is only slash and backslash.
I think there is a great deal of misunderstanding about the mikerubel.org article; rsync/hard link is far better suited to backup than raid1 (which, as you suggest, isn't really a backup at all). For starters, rsync/hard-link backups live on a separate disk on a separate machine. To erase them, you have to gain access to that machine, which may be strongly locked-down--mine is accessible only from the console. Many point-in-time backups are created by the method, though storage is efficient, and they can be NFS exported back, read-only, to one or more clients. It's much easier to verify and restore than tape. If you're really uncomfortable with online backups, buy usb2 hard drives, use the same technique to make backups onto them. You can unplug the hard drives when you're done and place them in a vault. Or build two backup servers and make sure at least one is always powered-off.
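For those who haven't read the article, the core trick can be sketched in a few lines. This is not the article's actual script (which uses rsync and cp); it is a simplified stand-in, and the `snapshot` function and its parameters are hypothetical. Files unchanged since the previous snapshot are hard-linked rather than copied, so each snapshot looks like a full tree but only changed files take new space. I compare file contents here for simplicity, where rsync would normally look at size and modification time.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, previous=None):
    """Copy source into dest, hard-linking files unchanged since previous."""
    for dirpath, _dirs, files in os.walk(source):
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(dest, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            old = None
            if previous is not None:
                old = os.path.normpath(os.path.join(previous, rel, name))
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the existing inode
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
```

Because a changed file is written as a fresh copy instead of being modified in place, older snapshots keep their original contents. That is what makes this a point-in-time backup and not just a mirror.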
This seems like they are really putting emphasis on this feature. Could it mean they might need it if NetBSD dies?
1) Ghost is a piece of shit. I've never talked to anyone who has used it with success. It seems to always corrupt its own files. A simple tarball is better in a number of ways IMO.
2) Ghost is not the same as this. RTFA. This is a change to the device drivers that replicates and records what has been done at a block level over an encrypted network connection to a backup server. This isn't really meant for the average user.
NetApp has been doing something similar for a very long time. A lot of people use the Sun boxes on the frontend to boot or attach to the storage appliance and let it do the backups. It saves space and saves the server from having to do it.
oh, wait, what?
trustedworlds.net - gaming, security, and the gunk that lives in between
You are missing a few things like fire, flood, lightning etc.
with the Coda filesystem. Or am I missing something?
LedgerSMB: Open source Accounting/ERP
rsync -avz ~/ user@remote:homebackup
in crontab?
I don't want to read
Actually, he's been around as Der Mouse since I was in college (circa 1985). I ran the xterm replacement he wrote back then. Long, long before Theo had his hissy fit and forked off OBSD. Of course, a trivial Google would have shown that, but hey, an AC wouldn't want to miss out on an ad hominem flame...
Mod parent up, "Funny"!
Ever considered using RAID1 in a backup system? I've not tried it, but it isn't difficult to see how it can be implemented.
Think of it in a similar way as a tape-based backup system...
1) implement RAID1,
2) have many spare disk drives (they're cheap now),
when you want a snapshot backup :
3) 'fail' one of the drives,
4) remove it,
5) install spare drive and add it to the RAID1 (it'll rebuild automatically),
6) take 'failed' drive off site/lock it in safe/whatever you would do with a tape
I'm not entirely sure how one would restore, but it should be fairly easy.
Max.
Grieving is a process, and it's totally normal to go through feelings of shock, sadness, anger even guilt. The healing process is different for everyone. It might take you six weeks to move on, or it might take you six years. Don't beat yourself up because you're not "over it" yet. It takes time to heal wounds.
So what else can you do to feel better? It might sound corny, but try writing a letter, making a collage, or planting a tree in memory of the operating system you've lost. Remembering and celebrating all the good things *BSD brought to your life might help give you some closure, and having a keepsake to honor *BSD may help you get through some tough times in the future when you'll be missing it.
It's true that life won't be the same without *BSD around. It may seem like you'll never feel better, but eventually you will. Take some comfort in the old saying, "Time heals all wounds," and remember that *BSD will always be with you in your heart.
Why is the "Sections" missing BSD?
In a startling turn of events today, a previously little-known fact came into the public eye: "*BSD Sux0rs". This came as a complete surprise to the BUWLA, or BSD Users With Large Assholes, as they previously thought that *BSD 0wned.
"You see, even though I have never contributed code to any BSD project, I thought it was my duty to be a big asshole to others which don't use the OS I do, because it just 0wnz.", said one FreeBSD user. "Now that I know it sux0rs, though, I have to go find something else to be an asshole about."
One notorious OpenBSD fanatic known as WideOpen, told reporters, "I have to kill myself. This isn't how it was supposed to happen. My BSD has always been the best, and shouting that opinion in other people's faces at every chance I got has been my only hobby. It was all I ever did. It was what got me out of bed in the morning. Now I have to die. I will jam my bedpost up my ass until I hit my brain. It is the only way to go: BSD style."
In the volatile world of operating systems anything can happen. "At least we don't sux0r as much as Windows users", BigAzz, a relatively well-known NetBSD user said. "Screaming things in people's faces is my calling. Now I need to scream that BSD sux0rs. What a sad world. At least I won't kill myself like those uber-asshole OpenBSD guys. They are just way over the top. Or were, at least."
Nobody knows for sure what the future holds for the state of operating systems, but with Netcraft confirming the sux0r status, *BSD users all over the world will have to stick something else up their asses from now on or risk looking even more gay than they used to.
If you aren't looking for network functionality, there's a filesystem called ext3cow that lets you roll back to older versions of the contents of the filesystem.
Any program relying on (nontrivial) preemptive multithreading will be buggy.
RAID is not a backup solution.. arghhh! When will these people learn.
- Essentially the same as Linux' md plus AoE (ATA over Ethernet), which is also built into the kernel, but more modular
- Essentially the same as Linux' md plus Linux' nbd (network block device), which is also built into the kernel, but more modular
- Built into the kernel when it could be a daemon that is notified of changes by the kernel instead (don't know if NetBSD's kernel does this, others do)
- It's not a backup solution but a RAID: It only protects you from disk failures, not from brain failures, something that backup solutions can do
I work at a research institute where we run a cluster of Linux workstations that boot from a server and keep all their data on a file server. Both servers are Linux boxes that use drbd to keep their configuration synchronized. The file server is a set of two front-end machines (one active, one waiting) that additionally use md+nbd to create a network RAID-1 over many back-end nodes. Each back-end node has a RAID-1 of local disks. This means that each disk in each back-end is redundant and each back-end is redundant, so that all data is stored four times instantly upon each write (yes, we do have a separate backup solution), distributed over two separate floors in different parts of the building. And this was implemented by one single totally underpaid student (granted, he's good...)!
So tell me again what's new about TFA.
Now I'm probably Slashdot's most naive reader but...
Couldn't the same be done on any OS using iSCSI and software RAID?
Apparently I'm missing something.
Yay for you. Now go back to masturbating over pictures of tux and stop posting trolls with links that are 5 years old.
Now this is what bugs me. Why do you always assume that it's Linux users trolling you? As you have demonstrated, BSD zealots are so eminently trollable that literally anyone can have a go. Even an MCSE could probably generate tens of angry replies with a cut & pasted troll.
It's quite likely that you are in fact trying to insult a script. Have some self respect for fuck's sake.
until you can write a program in Lua to give me a BJ, speak for yourself