NetBSD's Real-Time Network Backup
jschauma writes "One of NetBSD's developers, der Mouse, was interviewed by DaemonNews about his real-time network backup system (originally presented at BSDCan 2005), where changes to your local filesystem are automatically propagated to a backup server. In his interview der Mouse tells about his idea, how it works, and of course, how cool it is."
But hasn't Sun been doing this with Solaris for at least 3 years?
Any fool can criticise, condemn, and complain, and most fools do. - Benjamin Franklin
Here is the idea behind the setup I am currently using: Easy Automated Snapshot-Style Backups with Linux and Rsync.
I think the point is that it could be used for an off-site backup. RAID does not protect you from hurricanes, or even fires.
This idea is really cool, but implementing it by putting hooks into each device driver seems overly complicated. It also doesn't sound like there's any sort of priority setting for this or any type of data filtering.
Personally I'd like to see something like the MS filesystem in development that allows SQL calls to be run against it (not sure if there are any other filesystems that are similar). Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data while the network isn't being used.
That would achieve the same thing, but more flexibly and without affecting normal use.
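For what it's worth, the polling half of that idea doesn't need a queryable filesystem. Here is a rough Python sketch that walks a tree and picks out files modified since the last pass, filtered by extension. The function name `changed_since` and its parameters are made up for illustration, it only looks at mtimes, and it has nothing to do with how the NetBSD system actually works.

```python
import os
from pathlib import Path


def changed_since(root, since, suffixes=None):
    """Return sorted paths under `root` modified after `since` (epoch seconds).

    `suffixes` is an optional set such as {".txt", ".conf"}; when given,
    files with any other extension are skipped.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if suffixes and path.suffix not in suffixes:
                continue
            try:
                mtime = path.stat().st_mtime
            except OSError:
                continue  # file vanished between listing and stat
            if mtime > since:
                hits.append(path)
    return sorted(hits)
```

You'd run something like this from a timer every 5 minutes, remember the time of the previous pass, and hand the result to whatever does the transfer. Note that a scan like this never sees deletions, and it misses anything that changes and changes back between passes, which is part of the argument for hooking in at a lower level.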
You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
Sounds like it's essentially a DoubleTake daemon for BSD, cool, I wonder how well it scales? Say if you wanted to fully mesh 10 or more servers or something. Sounds like it might come in handy for keeping the content in web farms in sync as well....
Obviously, RAID servers don't help you in the case of accidental deletion. And they certainly don't help if your whole computer gets blown up.
Still, you'd want to be careful with this, it would suck to back up all the temp files generated by random processes.
autopr0n is like, down and stuff.
This concept has been in play for years as a commercial product for Disaster Recovery, Veritas Volume Replicator (VVR).
yes you are missing the point..
take 10 small servers that do the front end grunt work with 2-3 backup servers that keep complete working images of the servers and have access to their data..
if a front end server dies, service can roll over to a backend until the front is replaced and quickly made just like the original. if a backend dies, you have a second, and if all the backups die then you still have the front end to recreate the backups..
you don't normally consider the bandwidth costs as they are typically on a highspeed network between them and it offers you the option of replication over different connections and areas..
all redundant disks help with is a disk dying, not ram or cpu failing
some people have gotten too attached to their physical backups and tapes - personally a backup is worthless if i can't have live access to it in a few min even if i am not physically at the point of failure..
this isn't particularly useful for small setups but is great for mid to large scale setups and offers plenty of room to grow.
'...if only "Jumping to a Conclusion" was an event in the Olympics.'
"Those crazy Germans."
No, that would be "der Maus"
You crazy Americans -- Hier ist der Maus!
"Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
Those crazy Germans.
According to the article, he's Canadian.
I enjoy large posteriors and I cannot prevaricate.
How does this compare with Linux Network Block Device? Sounds very similar.
There are pretty mature commercial tools for this stuff, as well - Veritas' VVR replication comes to mind.
I don't actually run RAID, but I've gotten some interesting stories from some (more than 1) people who do.
In a RAID cabinet, you have a bunch of identical drives, most likely purchased together, too. Then you submit them to an essentially identical environment and operating history. Barring a defect, and assuming wearout-type phenomena, something bad may well happen.
The weakest drive fails first. Power down the RAID box to replace the bad drive, so you can bring it back up and restore the data. The stress of the power-down and restart is enough to kill the second-weakest drive. Now you have to go back to tape, and RAID didn't do squat. This doesn't happen all the time, but it's more likely than you'd think - enough so that they've quit using RAID as "backup".
Another alternative would be using different drive models, or finding some other way to change the vintage/history issue. Hotplugging drives while leaving the cabinet up would be another good idea.
The living have better things to do than to continue hating the dead.
Termites, on the other hand...
The weakest drive fails first. Power down the RAID box to replace the bad drive, so you can bring it back up and restore the data.
well, no. enterprise level RAID has spinning spares and hotswappable everything. you can lose two drives and still be running as long as you get those replacements in there before number 3 goes. been there, and yes, it happened when we shut down for maintenance. In the real world catastrophic failure happens. RAID is not usually used as a backup; it is used to keep data available in the event of a hard drive failure. That is why you have a tape backup every night of the RAID, and an extra set offsite somewhere. We have all heard the phrase, "a backup of the backup".
music lover since 1969
How does this protect against an rm -rf against the filesystem... I guess it would trash the backup on the other side.
Why do you have to settle for one or the other? A proper backup strategy, like any security strategy, should involve more than one technology.
Hotswappable RAID has saved my servers on more than one occasion. Likewise, the servers have also been saved by tape backups. RAID5, tape backups, and data replication all have different pros and cons.
I think it is incorrect to say RAID5 is not acceptable in any backup strategy. The more chances you get at data redundancy, recovery, and failover, the better off your organization is.
I don't actually run RAID, but I've gotten some interesting stories from some (more than 1) people who do.
I'll comment on this later...
The weakest drive fails first. Power down the RAID box to replace the bad drive...
OK, this is where I start getting dizzy. If their data is valuable enough to have RAID, why were they such cheap bastards that they didn't get hot-swap drives? I've worked in a LOT of places that have RAID systems, and three of my own servers have RAID, yet to date, none of them were anything but hot-swap. Additionally, with a small amount of intelligence and a few extra dollars, the administrator always puts in a hot-standby drive that will automatically take over if a drive fails, allowing for the failed drive to be replaced at a more convenient time than 1:30am without sacrificing the redundancy. Sysadmins running really critical systems will often have multiple hot-standby drives.
The stress of the power-down and restart is enough to kill the second-weakest drive.
Now, see, here's the funny part. When you spend the bucks for SCA hot-swap drives, you actually get drives of decent enough quality that this is very rarely a problem. Even if you did have to shut the array down, which you won't because you bought proper hardware.
enough so that they've quit using RAID as "backup"
Further evidence of idiocy. RAID is not a backup. RAID allows you to keep running in the event of a specific type of hardware failure. But that is all it protects you from. Backups are still just as critical as they were before you had RAID. Anyone who uses a RAID array instead of proper backups deserves to have their data sacrificed to the gods of entropy, shortly followed by their own careers.
As for my delayed comment on the first sentence... Well, I suggest you get smarter friends.
This is basically RAID over the network. Personally, I can't see a lot of use for it... Just put the second drive in the machine, and use software RAID, rather than putting the second drive in a network server. Less network slowdown and congestion that way, not to mention CPU-time wasted packetizing, encrypting, etc.
As always, RAID (and now this) is not a backup solution.
Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
As of Windows 2003 R2, there is a capability to do a VSS type of thing over the network to a remote server.
I'm a little ashamed that I know that, but it's true.
I also don't see how this solution is effectively any better than RAID... If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually). I'd think the backup server would need to be maintained as well... and if your backup ever fails, it seems like it would require a lot to set up another.
I only skimmed TFA and it's not clear to me how like or unlike Windows' Distributed File System it is, but I'll give you a quick picture of what DFS does for us here to give you a better idea how NetBSD's backup could be handy. We've got a primary and secondary server, each with its own RAID array, and DFS isn't a replacement for it - it's a supplement to it. I'd consider this to be the same.
For starters, when your server fails your RAID array goes with it. The data's fine of course (knock on wood), it's just not available until you either fix the server or shuffle the array into another system. Compound that with the fact that I only drop by here a couple times a week, and I'm the only person who could do this work (we're a small office). When that failure happens, the data would probably be offline for hours at minimum, and that would be a hardship in this environment. Having our data perpetually backed up on another working system that's just waiting to take over is easily worth the trouble and expense of a second system.
In addition, DFS doesn't actually record a duplicate copy of the whole disk's file system (one-way to the backup server), nor does it work in the transactional manner that I picture this working, but it replicates files within a special share both ways. You create this share, and it isn't actually on either server - it's on BOTH servers. DFS decides which one to use and keeps the copies synchronized. If the primary server catches on fire, gets stolen, explodes etc., users would hardly notice. There's a little lag in replication sometimes, so something very recently saved in the primary copy of the share might not actually be in the secondary yet. Aside from that, almost everything else just keeps working.
The bandwidth could be an issue in another environment, but this particular server only gets a mild-to-moderate workout, and DFS is able to keep up. There are a couple of database applications that I only allow to replicate one-way because initially DFS started to choke trying to keep them synchronized both ways. For those, someone would have to switch the clients manually from using one server to using the other. Aside from those two, I can reboot either server at will without ever disturbing a user. I think that in the worst case, this is what you'd need to do with NetBSD's backup.
It's almost a troll to even mention it, since there are so many things pioneered by Multics....
Oh no! the rootkit got replicated to the backup server!
So what you have is not only a real-time backup, but also the ability to unwind any possible damage after a break-in or other event. If your backup server is only running a journalling client, it can be made extremely secure, and also provide an excellent auditing tool.
It would also be possible to delay the streams and have arbitrarily old filesystems available. Or to use a local journal as a buffer to smooth out the IO load as it is piped off site. It could also be used to augment a non-journalling filesystem, for crash recovery purposes, assuming your filesystem provides at least some consistency guarantees. In fact NetApp does something similar by logging FS operations to NVRAM, while the filesystem only writes consistency points periodically.
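To make the "unwind" and "arbitrarily old filesystems" part concrete, here is a toy Python sketch of replaying a journal up to a chosen point in time. The `Record` layout and the `replay` function are invented for illustration; this is not the DragonFly journal format or anything the real tools emit, just the general idea of rebuilding state from an ordered log of operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    ts: float          # when the operation happened
    op: str            # "write" or "delete" (hypothetical op names)
    path: str
    data: bytes = b""  # full new contents, for "write" records


def replay(journal, until):
    """Rebuild {path: contents} from all records with ts <= until."""
    state = {}
    for rec in sorted(journal, key=lambda r: r.ts):
        if rec.ts > until:
            break  # everything later is what we want to unwind
        if rec.op == "write":
            state[rec.path] = rec.data
        elif rec.op == "delete":
            state.pop(rec.path, None)
        else:
            raise ValueError(f"unknown op: {rec.op}")
    return state
```

If the break-in happened at time 3, replaying with `until=2` gives you the tree as it stood just before, rm -rf and rootkit included in the log but not in the result. A real journal would record partial writes, renames, metadata and so on, but the replay loop has the same shape.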
Although I haven't read about the NetBSD work, I am sceptical that they could get the error handling to work correctly at that level. With the DragonFly journalling, there is support for transactional consistency, as well as recovering from interrupted network connections. While it is not complete, much of it is in place and functional.
See the mountctl manual page for attaching a journal to a mount, and the jscan manual page for processing the journal.
In Linux a RAID array can contain any block devices, including network block devices, ramdisks, whatever.
(I read this in a Linux software RAID tutorial once)
NetApp has been doing something similar for a very long time. A lot of people use the Sun boxes on the frontend to boot or attach to the storage appliance and let it do the backups. It saves space and saves the server from having to do it.
Actually, he's been around as Der Mouse since I was in college (circa 1985). I ran the xterm replacement he wrote back then. Long, long before Theo had his hissy fit and forked off OBSD. Of course, a trivial Google would have shown that, but hey, an AC wouldn't want to miss out on an ad hominem flame...