Slashdot Mirror


NetBSD's Real-Time Network Backup

jschauma writes "One of NetBSD's developers, der Mouse, was interviewed by DaemonNews about his real-time network backup system (originally presented at BSDCan 2005), where changes to your local filesystem are automatically propagated to a backup server. In his interview der Mouse tells about his idea, how it works, and of course, how cool it is."

166 comments

  1. Correct me if I'm wrong by thedletterman · · Score: 5, Interesting

    But hasn't Sun been doing this with Solaris for at least 3 years?

    --
    Any fool can criticise, condemn, and complain, and most fools do. - Benjamin Franklin
    1. Re:Correct me if I'm wrong by operagost · · Score: 4, Interesting

      OpenVMS has been doing this for even longer using volume shadowing.

      --

      Gamingmuseum.com: Give your 3D accelerator a rest.
    2. Re:Correct me if I'm wrong by vertinox · · Score: 3, Funny

      But hasn't Sun been doing this with Solaris for at least 3 years?

      Yes, but do you want to sell your children and a kidney for a Solaris server?

      --
      "I am the king of the Romans, and am superior to rules of grammar!"
      -Sigismund, Holy Roman Emperor (1368-1437)
    3. Re:Correct me if I'm wrong by sharpone · · Score: 1

      Only if you take both of them... the children that is ;)

    4. Re:Correct me if I'm wrong by Anonymous Coward · · Score: 0

      How do we know Apple didn't invent this?

      I think Steve Jobs announced this last week, right after the new iPod Workstation in faux leather.

    5. Re:Correct me if I'm wrong by Anonymous Coward · · Score: 1, Interesting

      I run solaris on non-sun x86 server hardware. For what we do, none of the linux distributed clustering solutions are as good as solaris.

    6. Re:Correct me if I'm wrong by Anonymous Coward · · Score: 0

      >Yes, but do you want to sell your children and a kidney for a Solaris server?

      You know that Solaris can be downloaded for free right? Maybe you have to sell children and your kidney for your ISP, but I just go down to the local coffee shop w/ wireless when I need to get me some free CD images.

    7. Re:Correct me if I'm wrong by kjs3 · · Score: 1

      Oh...you're one of those folks for whom time is free. Carry on.

    8. Re:Correct me if I'm wrong by Anonymous Coward · · Score: 0

      What?

      How are you going to use a computer, if you don't want to install an OS? Even if you have Windows XP, you still have to install it.

      If your PC or server came preconfigured, you exchanged your money for time. In that case, you might as well have bought a preconfigured solaris-box.

      It's simple.

    9. Re:Correct me if I'm wrong by kjs3 · · Score: 1

      Umm...exactly my point. Just because you can download Solaris/Linux/BSD or whatever for "free", it's not "free" in an economic sense to use it. You still have to install & support it, and for those of us that measure such things, that's where the majority of the expense is.

    10. Re:Correct me if I'm wrong by m750 · · Score: 1

      it works on solaris 10 x86. AO -- supports said sun product

      --
      www.underonesky.com
    11. Re:Correct me if I'm wrong by Anonymous Coward · · Score: 0

      I used to think that was the case until I reached the point in my job where I started doing those computations.

      Just today I was pricing out a new project and the initial hardware cost would pay for the employee cost of our entire IT shop (about 50 people) for 10 years. Now consider that it will only take 4 FTE to run/manage the hardware (actually two, but we over-allocate staff for vacations, etc.) and the person-time is a small part of the project.

      Heck, the yearly maintenance on the hardware / software dwarfs the person time.

      We used to run Linux on all the servers. Yes, there was more time involved. The decision to go with proprietary hardware, OS, and database certainly provides a sense of security to management. Admittedly, it also has provided me with some nice training (something never paid for with our free/open source software) and it is sometimes a benefit to make the "help!" calls those maintenance contracts provide.

      Still, those things are "insurance." Economically, I'm not sure it's worth it.

      As a final note, our longest service outage was with the existing proprietary system. We had failures under our open source infrastructure, but the bare-metal recovery or fail over were more quickly resolved than working with a vendor trying to find out why a vendor blessed configuration of a redundant system with built-in fail-over that should never fail did.

      Us: Just give us a new part and we'll re-install and restore.
      Them: We can't release those parts from the depot until we've confirmed the cause of the problem.
      Us: But we have a four-hour parts replacement commitment on the hardware support contract.
      Them: You have a better chance of winning the lottery than this happening. Are you sure it's plugged in?
      Us: ...

      At least all that money gives you a lot of ass kissing when they provide the engineering forensics explaining why the high-dollar-never-fails components didn't fail-over properly.

  2. B.S. D? by ExE122 · · Score: 1, Interesting

    So we could have backup servers all over the world keeping track of disk write commands...

    This is indeed very neat, but isn't it sorta how transactional databases have been working?

    I also don't see how this solution is effectively any better than RAID... If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually). I'd think the backup server would need to be maintained as well... and if your backup ever fails, it seems like it would require a lot to set up another.

    There also seem to be a lot of limitations as far as network security, filesystems, encrypted files, etc go. Furthermore, I don't see how the bandwidth hit is worth it (though I guess that depends on where your priorities are).

    Admittedly, I'm no expert on this topic... so am I totally missing something?

    --
    Capitalism: When it uses the carrot, it's called democracy. When it uses the stick, it's called fascism.
    1. Re:B.S. D? by ThePiMan2003 · · Score: 5, Insightful

      I think the point is that it could be used for an off site backup. Raid does not protect you from Hurricanes, or even fires.

    2. Re:B.S. D? by Amouth · · Score: 4, Insightful

      yes you are missing the point..

      take 10 small servers that do the front end grunt work with 2-3 backup servers that keep complete working images of the servers and have access to their data..

      a front end server dies, service can roll over to a backend until the front is replaced and quickly made just like the original. a backend dies and you have a second, and if all the backups die then you still have the front end to recreate the backups..

      you don't normally consider the bandwidth costs as they are typically on a high-speed network between them, and it offers you the option of replication over different connections and areas..

      all redundant disks help with is if a disk dies, not if ram or cpu fails

      some people have gotten too attached to their physical backups and tapes - personally, a backup is worthless if i can't have live access to it in a few min even if i am not physically at the point of failure..

      this isn't particularly useful for small setups but is great for mid to large scale setups and offers plenty of room to grow.

      --
      '...if only "Jumping to a Conclusion" was an event in the Olympics.'
    3. Re:B.S. D? by dpilot · · Score: 2, Interesting

      I don't actually run RAID, but I've gotten some interesting stories from some (more than 1) people who do.

      In a RAID cabinet, you have a bunch of identical drives, most likely purchased together, too. Then you subject them to an essentially identical environment and operating history. Barring a defect, and assuming wearout-type phenomena, something bad may well happen.

      The weakest drive fails first. Power down the RAID box to replace the bad drive, so you can bring it back up and restore the data. The stress of the power-down and restart is enough to kill the second-weakest drive. Now you have to go back to tape, and RAID didn't do squat. This doesn't happen all the time, but it's surprisingly more likely than you'd think - enough so that they've quit using RAID as "backup".

      Another alternative would be using different drive models, or finding some other way to change the vintage/history issue. Hotplugging drives while leaving the cabinet up would be another good idea.

      --
      The living have better things to do than to continue hating the dead.
    4. Re:B.S. D? by topical_surfactant · · Score: 5, Funny
      Raid does not protect you from Hurricanes, or even fires.

      Termites, on the other hand...

    5. Re:B.S. D? by hawicz · · Score: 1

      It's better than RAID in that if your entire main server gets toasted (e.g. literally, your house burns down, or similar), you've still got a backup.

      From what I gathered in the article, it isn't similar to a database's logs because it just mirrors the writes, without saving the old disk image. Although it sounds like it would be fairly easy to just save the packet stream and have the ability to replay your disk image to any particular moment in the past.
      That would be much better than RAID. After all, backups aren't just to protect against drives failing; they're also to protect against accidental "rm -rf"'s in the wrong place.

      As for setting up a new backup server, if it's just a app listening on the backup server, it shouldn't be any more difficult than installing any other application.
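
      The "save the packet stream and replay it" idea above can be sketched in a few lines. This is only a minimal illustration, assuming the backup server keeps a full base image plus a timestamped log of every mirrored write; the WriteRecord layout and the replay() helper are hypothetical names, not anything taken from der Mouse's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteRecord:
    """One mirrored disk write (hypothetical record layout)."""
    timestamp: float  # when the write happened on the primary
    offset: int       # byte offset into the disk image
    data: bytes       # the bytes that were written


def replay(base_image: bytes, log: list[WriteRecord], until: float) -> bytes:
    """Rebuild the disk image as it looked at time `until`.

    Starts from a full copy taken before the first logged write and
    re-applies, in order, every write stamped at or before `until`.
    """
    image = bytearray(base_image)
    for rec in sorted(log, key=lambda r: r.timestamp):
        if rec.timestamp > until:
            break
        end = rec.offset + len(rec.data)
        if end > len(image):
            raise ValueError("write extends past the end of the image")
        image[rec.offset:end] = rec.data
    return bytes(image)


base = b"." * 16
log = [
    WriteRecord(timestamp=1.0, offset=0, data=b"report"),
    WriteRecord(timestamp=2.0, offset=8, data=b"v2"),
    WriteRecord(timestamp=3.0, offset=0, data=b"\0" * 16),  # the accidental wipe
]

before_wipe = replay(base, log, until=2.5)  # stops just short of the wipe
after_wipe = replay(base, log, until=3.0)   # what a plain mirror would hold
```

      A plain mirror only ever holds the equivalent of after_wipe; keeping the log is what makes before_wipe recoverable, at the cost of storing every write.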

    6. Re:B.S. D? by C10H14N2 · · Score: 1

      We have about 20 SQL servers around the country connected via leased-line T1s originally designed to be constantly replicating to HQ. Not a huge system, but about 2TB of data total. One can imagine the kind of bandwidth such a redundant system sucks up. The cost and performance hits associated with this are absolutely extraordinary and there simply is no reason for about 90% of the load.

      It's kind of like the SunRay system. Yeah, the idea is neat, but the architecture is a network-clogging, CPU-leeching nightmare and there is an ample supply of alternatives that have nowhere near the same scaling issues.

    7. Re:B.S. D? by ctr2sprt · · Score: 1
      It's essentially an append-only remote filesystem. That comes with both benefits and drawbacks. The fundamental benefit is point-in-time recovery. Coincidental benefits include dramatically lower average throughput (since backups are always happening) and the potential for lower total backup bandwidth (if a 5GB log file gets 200MB of new entries, an incremental would have to back up the entire 5.2GB log file; a log-based backup system would only back up the new 200MB). It would probably also make the backup admin's life easier by making backup traffic far more constant, so he can better plan how fast his hardware needs to be.

      The downside is that this can end up taking much more space (if you delete and recreate a 50MB file ten times a day, the entire file is backed up ten times versus just once with a typical daily-incremental scheme). Restores can also be much slower depending on how often checkpointing is done. Doing checkpoints too often can defeat the advantages of logging, but done too infrequently the backup server will have to analyze an entire week's worth of data just to restore one file.

      I do think "backup logging" is the way of the future. The advantages are too important for it not to be. It just may take a few more years before we're really ready for it, in large part because it requires significant cooperation from the operating system to be done well. Even filesystem snapshots are still somewhat immature and underutilized by OTS backup solutions, and we've had those for years and years.
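
      The two numeric examples above (the growing log file and the file that is deleted and recreated ten times) reduce to simple arithmetic. A rough sketch, assuming a file-level incremental re-sends any changed file in full once per day and a write-log backup sends exactly the bytes written; both helper names are made up for illustration, and sizes are in decimal MB to match the 5.2GB figure.

```python
def file_incremental_mb(final_size_mb: int) -> int:
    """A file-level daily incremental re-sends a changed file in full, once."""
    return final_size_mb


def write_log_mb(writes_mb: list[int]) -> int:
    """A write-log backup sends every write, and nothing else."""
    return sum(writes_mb)


# Case 1: a 5GB log file gains 200MB of new entries.
append_incremental = file_incremental_mb(5000 + 200)  # whole 5.2GB file
append_write_log = write_log_mb([200])                # only the new 200MB

# Case 2: a 50MB file is deleted and recreated ten times in one day.
churn_incremental = file_incremental_mb(50)           # one copy at backup time
churn_write_log = write_log_mb([50] * 10)             # every rewrite is logged
```

      The log wins by a factor of 26 in the first case and loses by a factor of 10 in the second, which is the trade-off described above.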

    8. Re:B.S. D? by LWATCDR · · Score: 1

      1. Point in time recovery. This allows you to restore back to some point in the past. Good for recovering deleted files.
      2. Off site backups. A second server located at another office just in case of Earthquake, Fire, Flood, Hurricane, Tornado, or some other disaster.
      Raid doesn't replace backups. With encryption you could keep your backup server at a co-location facility, branch office, or home. Handy if the worst does happen.
      Another option would be for a local consulting firm to offer this as part of their service. Get an SDSL line per customer and keep a backup at your office for a monthly fee.

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    9. Re:B.S. D? by djdavetrouble · · Score: 2, Insightful

      The weakest drive fails first. Power down the RAID box to replace the bad drive, so you can bring it back up and restore the data.

      well, no. enterprise level raid has spinning spares and hotswappable everything. you can lose two drives and still be running as long as you get those replacements in there before number 3 goes. been there, and yes, it happened when we shut down for maintenance. In the real world catastrophic failure happens. Raid is not used as a backup usually, it is used to keep data available in the event of a hard drive failure. That is why you have a tape backup every night of the raid, and an extra set offsite somewhere. We have all heard the phrase, "a backup of the backup".

      --
      music lover since 1969
    10. Re:B.S. D? by PartialInfinity · · Score: 5, Insightful

      Why do you have to settle for one or the other? A proper backup strategy, like any security strategy, should involve more than one technology.

      Hotswappable RAID has saved my servers on more than one occasion. Likewise, the servers have also been saved by tape backups. RAID5, tape backups, and data replication all have different pros and cons.

      I think it is incorrect to say RAID5 is not acceptable in any backup strategy. The more chances you get at data redundancy, recovery, and failover, the better off your organization.

    11. Re:B.S. D? by Desert+Raven · · Score: 5, Insightful

      I don't actually run RAID, but I've gotten some interesting stories from some (more than 1) people who do.

      I'll comment on this later...

      The weakest drive fails first. Power down the RAID box to replace the bad drive...

      OK, this is where I start getting dizzy. If their data is valuable enough to have RAID, why were they such cheap bastards that they didn't get hot-swap drives? I've worked in a LOT of places that have RAID systems, and three of my own servers have RAID, yet to date, none of them were anything but hot-swap. Additionally, with a small amount of intelligence and a few extra dollars, the administrator always puts in a hot-standby drive that will automatically take over if a drive fails, allowing for the failed drive to be replaced at a more convenient time than 1:30am without sacrificing the redundancy. Sysadmins running really critical systems will often have multiple hot-standby drives.

      The stress of the power-down and restart is enough to kill the second-weakest drive.

      Now, see, here's the funny part. When you spend the bucks for SCA hot-swap drives, you actually get drives of decent enough quality that this is very rarely a problem. Even if you did have to shut the array down, which you won't because you bought proper hardware.

      enough so that they've quit using RAID as "backup"

      Further evidence of idiocy. RAID is not a backup. RAID allows you to keep running in the event of a specific type of hardware failure. But that is all it protects you from. Backups are still just as critical as they were before you had RAID. Anyone who uses a RAID array instead of proper backups deserves to have their data sacrificed to the gods of entropy, shortly followed by their own careers.

      As for my delayed comment on the first sentence... Well, I suggest you get smarter friends.

    12. Re:B.S. D? by MonkeyOfRage · · Score: 2, Interesting

      I also don't see how this solution is effectively any better than RAID... If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually). I'd think the backup server would need to be maintained as well... and if your backup ever fails, it seems like it would require a lot to set up another.

      I only skimmed TFA and it's not clear to me how like or unlike Windows' Distributed File System it is, but I'll give you a quick picture of what DFS does for us here to give you a better idea how NetBSD's backup could be handy. We've got a primary and secondary server, each with its own RAID array, and DFS isn't a replacement for it - it's a supplement to it. I'd consider this to be the same.

      For starters, when your server fails your RAID array goes with it. The data's fine of course (knock on wood), it's just not available until you either fix the server or shuffle the array into another system. Compound that with the fact that I only drop by here a couple times a week, and I'm the only person who could do this work (we're a small office). When that failure happens, the data would probably be offline for hours at minimum, and that would be a hardship in this environment. Having our data perpetually backed up on another working system that's just waiting to take over is easily worth the trouble and expense of a second system.

      In addition, DFS doesn't actually record a duplicate copy of the whole disk's file system (one-way to the backup server), nor does it work in the transactional manner that I picture this working, but it replicates files within a special share both ways. You create this share, and it isn't actually on either server - it's on BOTH servers. DFS decides which one to use and keeps the copies synchronized. If the primary server catches on fire, gets stolen, explodes etc., users would hardly notice. There's a little lag in replication sometimes, so something very recently saved in the primary copy of the share might not actually be in the secondary yet. Aside from that, almost everything else just keeps working.

      The bandwidth could be an issue in another environment, but this particular server only gets a mild-to-moderate workout, and DFS is able to keep up. There are a couple database applications that I only allow to replicate one-way because initially DFS started to choke trying to keep it synchronized both ways. For those, someone would have to switch the clients manually from using one server to using the other. Aside from those two, I can reboot either server at will without ever disturbing a user. I think that in the worst case, this is what you'd need to do with NetBSD's backup.

    13. Re:B.S. D? by towsonu2003 · · Score: 1
      second hard drive for a RAID system
      I would prefer giving my second hard drive to the Ancients rather than the Wraith... That's my opinion of course.
    14. Re:B.S. D? by MonkeyOfRage · · Score: 1

      I've heard this before, and I've always found it just a tad far-fetched. Even with drives made the same day with the same batches of components I would expect a little more variation in their lifespans. I suppose it could be true, but I've never experienced it firsthand or met someone who did.

      What used to be a problem though was the power demand when all the drives spin up - you could replace a drive and then kill the power supply when you turned it back on. Dying power supplies can take components with them, so I'd find it credible that maybe you'd lose a second drive this way with a cheap power supply. I would hope that any modest cabinet today would have a better power supply than that, though.

      I haven't bought a SCSI drive in years, but it was common at the time to be able to configure a spin-up delay on SCSI drives for exactly this reason. You could stagger their startups to avoid having them all hammering the power supply at once. Obviously, if you could dodge that bullet, there's no risk of it taking a drive or two with it even if it is a cheapie.

    15. Re:B.S. D? by Anonymous Coward · · Score: 0

      LMFAO

    16. Re:B.S. D? by MonkeyOfRage · · Score: 1

      If their data is valuable enough to have RAID, why were they such cheap bastards that they didn't get hot-swap drives?

      I've been hearing this story since at least the early 90's. Hot swap capability wasn't as common then as it appears to be today. I don't recall numbers, but I recall that the cabinet that I would have liked cost more than the rest of our little network.

    17. Re:B.S. D? by vertinox · · Score: 1

      If anything, a backup server is more expensive than a second hard drive for a RAID system (though it may pay off eventually).

      Unless you have hotswap ability, if a hard drive fails, you still have to power it down to remove it. If you have a second server up and running, you won't have any downtime other than changing an IP address.

      Sure it might be only 20 minutes to swap hard drives tops, but a server down during business hours is still a pain.

      --
      "I am the king of the Romans, and am superior to rules of grammar!"
      -Sigismund, Holy Roman Emperor (1368-1437)
    18. Re:B.S. D? by mattyrobinson69 · · Score: 2, Informative

      In linux a RAID array can contain any block devices, including network block devices, ramdisks, whatever.

      (I read this in a linux software RAID tutorial once)

    19. Re:B.S. D? by dpilot · · Score: 1

      They had backups - that was even in my comment.
      But it can be a pain to restore from tape, and the data is offline while you do so.
      The idea of the RAID was to not have downtime, and some non-trivial amount of the time, it just doesn't succeed in that.
      They weren't stupid, just inconvenienced, and their SLA took a hit.

      --
      The living have better things to do than to continue hating the dead.
    20. Re:B.S. D? by kv9 · · Score: 1

      you don't have to use it for everything. i, for example, would use it for some small databases and my svn repos. if your box dies between the nightly backups, it ain't such a big deal that you lost, say, a day's worth of apache logs and mail. but it would suck if you lost a day's worth of orders or that code you cranked out furiously over lunch. i say, prioritize!

    21. Re:B.S. D? by brianosaurus · · Score: 1

      That may be true, but try having one node of your RAID array located offsite and see how blazingly slow it is.

      RAID is redundancy, not backup.

      --
      blog
    22. Re:B.S. D? by jbplou · · Score: 1

      I think it is incorrect to say RAID5 is not acceptable in any backup strategy
      I think the problem is that RAID is not a backup solution at all. RAID is a redundancy solution. RAID won't let you roll back to a point in time if there is a massive configuration problem or an admin deletes all your files. Nor will RAID do anything for you if your server dies outside of the disks. No, RAID-1, 5, and 10 just provide you with disk redundancy. This is not a backup solution; this is for maximum uptime and also can provide performance advantages.

    23. Re:B.S. D? by PartialInfinity · · Score: 1

      Yup. Which is why I said it is part of the overall backup strategy. The previous poster was suggesting that people he knew were abandoning RAID5 but I was saying that RAID5 has its place in the overall scheme.

      The value of backups declines along with the credibility and availability of the data to be backed up.

    24. Re:B.S. D? by Anonymous Coward · · Score: 0

      Me too, seen it happen, couldn't believe it either, but the theory that is given to explain the phenomenon makes sense, and the phenomenon is what prompted the theory, not the other way around.

      But for me this just means use RAID6, not RAID5. Have two hot spares, not just one. Have abundant cold spares, not just however many you think you can get away with. Buy disks in small batches so that they are sufficiently predictable, but not all identical. And above all, buy the best quality SCSI platters you can get and not the cheapest GB/$$$ IDE scrap metal available. Do I need to mention any brand name?

      And this is orthogonal to the topic, which is realtime network backup, not online drive failover via local drive redundancy. One keeps you up, the other is for having something to put up after you fall over.

      Goodbye, and good luck.

      Monster-of-God
      Theyarecomingtogetyou. Thisisnotasig.

    25. Re:B.S. D? by Anonymous Coward · · Score: 0

      or more likely, use RAID1/0 or something .. i don't even remember what RAID6 is to be honest, but you get what I mean. Increment numbers everywhere. It's good for you, and it's good fun. This is your mother talking.

      Monster-of-God
      Noit'snotasiggetreal

    26. Re:B.S. D? by Anonymous Coward · · Score: 0

      Did you read the post about hot-swappable drives yet?

      Anyone who mentions the term "RAID" in a data-integrity context, and doesn't mean "hotswappable SCSI" is a troll. Man, good SCSI platters that are NOT in RAID are more reliable and faster than RAIDED IDE. However they seem to cost more if you are illiterate wrt the actual use of the things.

      Have a look how long an IDE RAID takes to rebuild, assuming you actually are using the amount of cheap storage you have so addictively purchased, and you will start to become enlightened. Rebuild time is a real factor if the RAIDism is being used for data integrity, and rebuild time is a function of disk speed vs storage capacity.

      IDE is as cheap as IDE is *because* it has large capacity relative to the speed of the medium. This is why it is being sold much cheaper than SCSI, and why only SCSI is really the only credible option for keeping data reliably accessible (which is why IDE is as cheap as IDE is ... and SCSI is overpriced .. the one subsidises the other .. IDE is a byproduct of the SCSI industry .. IDE is SCSI that can't pass QA, and so is sold cheap .. both are necessary and useful, but only one is meant for 24x7 accessible data).

      InfernoInvestigates

    27. Re:B.S. D? by thedletterman · · Score: 1

      "I don't see how the bandwidth hit is worth it " This is like the third comment I've seen on this subject. What bandwidth hit? We're talking internal traffic, like I've got 3 tv channel broadcasts going out to whoever wants to watch TV at thier desk.. I don't see how people mirroring their spreadsheets everytime they click save into a central server is going to bring the house down.
      I mean shit, even if you had a workstation downloading porn as fast as it could from usenet... their available internal bandwidth should trump their available external bandwidth a hundred fold.

      --
      Any fool can criticise, condemn, and complain, and most fools do. - Benjamin Franklin
    28. Re:B.S. D? by thedletterman · · Score: 1
      RAID5 is not a backup strategy, it's disk redundancy. What happens when your power supply goes apeshit and that box disappears in a puff of smoke? Or Mr. Pimpletech is rolling across his new Dell 4way on a dolly and smashes the corner of your box and crashes three disk heads corrupting the entire array?

      Mr. CEO shouts, "I thought you were backing up our finance server!"
      You try to explain, "Well I had RAID, so the disks backed each other up. It was more convenient than popping a tape in and out on a daily basis."
      Guess what data is going to be missing from finance next?
      Your payroll info.

      --
      Any fool can criticise, condemn, and complain, and most fools do. - Benjamin Franklin
    29. Re:B.S. D? by PartialInfinity · · Score: 1

      I think you're completely missing the point. I know RAID5 is not a backup strategy -- however, it is a PART of the overall backup strategy. There is a subtle but concrete distinction that I am making and I think you are failing to grasp it. Please re-read this thread more carefully.

    30. Re:B.S. D? by dpilot · · Score: 1

      So the answer is obvious, use the Ultimate - RAID 11! (Others' RAID only goes to 10)

      --
      The living have better things to do than to continue hating the dead.
    31. Re:B.S. D? by shaitand · · Score: 1

      So is this. The difference between redundancy and backup is the ability to step back to a previous state. With a redundant system all writes are immediately copied or nearly so and you lose the ability to retrieve a file you just deleted, this shares that characteristic.

    32. Re:B.S. D? by shaitand · · Score: 1

      "front end server dies service can roll over to a backend until the front is replaced and is quickly made jsut like the orginal"

      This is a pretty shaky solution at best. You could change the default bootloader config and tell the machine to reboot using the chosen image, but if there were the slightest problem that required a tweak on the system it may never come up and give network access again. If you replicate using a hybrid system where some portion of the configuration is already set up on the backup server and just the portions that vary are backed up from the various production servers, then you increase the possibility of complications as well. The biggest problems do not come from technical limitations but from unforeseen consequences and oversights during configuration.

    33. Re:B.S. D? by slazzy · · Score: 1

      or physical machine theft - don't laugh, it happens

      --
      Website Just Down For Me? Find out
    34. Re:B.S. D? by Anonymous Coward · · Score: 0

      What a clueless idiot!

      Please explain to me how termites protect you from hurricanes or fires.

  3. Neat. by Pig+Hogger · · Score: 3, Interesting
    This is definitely the way to go. With huge hard-disks that offer capacities beyond tape drives, it is less and less feasible to use traditional tape-based backup systems in many organizations, if only by the time taken by the frigging tape drive...

    Here is the idea behind the setup I am currently using: Easy Automated Snapshot-Style Backups with Linux and Rsync.
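
    The core trick behind snapshot-style backups of this kind is that a file which has not changed since the previous snapshot is hard-linked instead of copied, so every snapshot looks like a full copy but only changed files take new disk space. Below is a minimal Python sketch of that idea only; it is not the rsync-based procedure from the linked article, and the snapshot() helper and the snap.0 / snap.1 directory names are assumptions for illustration.

```python
import filecmp
import os
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def snapshot(source: Path, previous: Optional[Path], target: Path) -> None:
    """Copy `source` into `target`, hard-linking files unchanged since `previous`."""
    for dirpath, _dirs, files in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (target / rel).mkdir(parents=True, exist_ok=True)
        for name in files:
            src = Path(dirpath) / name
            dst = target / rel / name
            old = previous / rel / name if previous is not None else None
            if old is not None and old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the old copy's disk blocks
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "src"
    src.mkdir()
    (src / "a.txt").write_text("same")
    (src / "b.txt").write_text("old")

    snapshot(src, None, root / "snap.0")            # first, full snapshot
    (src / "b.txt").write_text("new")               # only b.txt changes
    snapshot(src, root / "snap.0", root / "snap.1")

    a_shared = os.path.samefile(root / "snap.0" / "a.txt", root / "snap.1" / "a.txt")
    b_shared = os.path.samefile(root / "snap.0" / "b.txt", root / "snap.1" / "b.txt")
    old_b = (root / "snap.0" / "b.txt").read_text()
    new_b = (root / "snap.1" / "b.txt").read_text()
```

    Both snapshots can be browsed as complete trees, and the older one still holds the old contents of b.txt. Hard links only work within one filesystem, so all snapshots have to live on the same backup volume.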

    1. Re:Neat. by Lord+of+Ironhand · · Score: 4, Interesting

      I prefer Dirvish, and I highly recommend that people looking for a good harddisk-based backup system take a look at it. I've looked long and hard for a good backup system and this is the first that seems to fit the bill for me.

    2. Re:Neat. by daniel_newton · · Score: 1

      rsnapshot (http://freshmeat.net/projects/rsnapshot/) packages Mike Rubel's concept into an easy-to-use package. I found some Red Hat RPMs somewhere too. It works great on our server.

    3. Re:Neat. by geniusj · · Score: 1

      Looks neat :-). I use a similar mechanism for backups of one of my boxes. Another one, however, uses backup space that I only have FTP, SCP, SFTP and rsync access to (no other shell commands), for that I use Duplicity, which is very clever. It even encrypts your backups using gpg.

      I should look into something like dirvish though to replace my current homemade 'backupd' which basically does the same thing with less flexibility.

    4. Re:Neat. by Anonymous Coward · · Score: 0

      Has anyone modified this method to use inotify instead of cron to determine when files have changed, and thus back them up incrementally per change rather than per time period? I could easily see this sort of thing becoming a killer app for the linux desktop. Never lose your family photo album again!
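
      To make the "per change rather than per time period" idea concrete, here is a small sketch of the change-detection half only. It deliberately does not use inotify: it compares two directory scans with the Python standard library, which is a polling stand-in for the events the kernel would deliver. The scan() and changed_since() helpers and the photo file names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def scan(root: Path) -> dict[str, tuple[int, int]]:
    """Map each file's relative path to its (mtime in ns, size) signature."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            st = path.stat()
            state[str(path.relative_to(root))] = (st.st_mtime_ns, st.st_size)
    return state


def changed_since(before: dict, after: dict) -> set[str]:
    """Files that are new, or whose signature differs between two scans."""
    return {path for path, sig in after.items() if before.get(path) != sig}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "photo1.jpg").write_bytes(b"aaaa")
    (root / "photo2.jpg").write_bytes(b"bbbb")
    first = scan(root)

    (root / "photo2.jpg").write_bytes(b"bbbb-edited")  # modified, size changes
    (root / "photo3.jpg").write_bytes(b"cccc")         # newly added
    second = scan(root)

    to_back_up = changed_since(first, second)
```

      Only the files in to_back_up would need to go into the next incremental. An inotify-based version would get the same list pushed to it as the writes happen, instead of having to walk the tree.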

    5. Re:Neat. by sloth+jr · · Score: 1

      If by "the way to go" you mean, hard-disk-based backups, then I'd agree with you. In this particular case, this acts as a poor-man's replacement for RAID-1 (mirroring), with the same problems inherent in that system that make it unsuitable for general backups. Consider a simple command - "rm -rf s *". Ooops! With a point-in-time backup, you're not necessarily SOL, though of course, you weigh that against the data lost between your backups.

    6. Re:Neat. by Bombcar · · Score: 1

      Well, the largest LTO 3 drives offer 400 GB uncompressed per tape, at 80 MB/s native transfer rate, which isn't too shabby.

    7. Re:Neat. by innate · · Score: 1

      Does anyone know how Dirvish compares to rsnapshot?

      --
      No, I don't want to explore the Recycle Bin.
  4. How is this different from Windows VSS? by YU+Nicks+NE+Way · · Score: 1, Interesting

    Volume shadow storage is exactly this kind of incremental, real-time backup process. How does this differ technically from that? (Other than the fact that you can now dynamically back up your morning toast, which is useful if a slice goes up in flames...)

    1. Re:How is this different from Windows VSS? by ROOK*CA · · Score: 1

      If I'm not mistaken VSS doesn't work across a network and VSS stores the snapshots on the same volume as the original data.

    2. Re:How is this different from Windows VSS? by Anonymous+Struct · · Score: 3, Informative

      As of Windows 2003 R2, there is a capability to do a VSS type of thing over the network to a remote server.

      I'm a little ashamed that I know that, but it's true.

    3. Re:How is this different from Windows VSS? by ROOK*CA · · Score: 1

      As of Windows 2003 R2, there is a capability to do a VSS type of thing over the network to a remote server.

      I'm a little ashamed that I know that, but it's true.

      Really?...learn something new everyday, thanks for the tip and of course knowing something isn't anything to be ashamed of. ;)

    4. Re:How is this different from Windows VSS? by ScrewMaster · · Score: 1

      True ... but ignorance of Windows is a welcome form of bliss to many of us.

      --
      The higher the technology, the sharper that two-edged sword.
  5. Good idea, but there has to be a better way by TheFlyingGoat · · Score: 3, Interesting

    This idea is really cool, but implementing it by putting hooks into each device driver seems overly complicated. It also doesn't sound like there's any sort of priority setting for this or any type of data filtering.

    Personally I'd like to see something like the MS filesystem in development that allows SQL calls to be run against it (not sure if there's any other filesystems that are similar). Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data as the network isn't being used.

    That would achieve the same thing, but more flexibly and without affecting normal use.

    --
    You have enemies? Good. That means you've stood up for something, sometime in your life. --Winston Churchill
    1. Re:Good idea, but there has to be a better way by jcgf · · Score: 3, Interesting
      A hook into each driver does seem like a strange way to do this, you would think that it could be done once at a higher level.

      Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data as the network isn't being used.

      Then you loose the realtimeness.

    2. Re:Good idea, but there has to be a better way by Anonymous Coward · · Score: 0

      How about something like the new inotify that is in the linux kernel from 2.6.12 on? You can get notified (in a userspace tool) for all kinds of actions on inodes (files and directories). Set it up to listen for all modifications (writes/creates/deletes) on a target, say /data/importantstuff/ and then do something like you are talking about for a backup program.

    3. Re:Good idea, but there has to be a better way by ivoras · · Score: 4, Interesting
      This idea is really cool, but implementing it by putting hooks into each device driver seems overly complicated.
      FreeBSD's GEOM is solving that: http://www.bsdcan.org/2004/papers/geom.pdf

      Also, there's "GEOM gate" on FreeBSD: http://garage.freebsd.pl/GEOM_Gate.pdf
      For other cool stuff with GEOM see here and here. See also this discussion thread about ggate's limits.

      --
      -- Sig down
    4. Re:Good idea, but there has to be a better way by ROOK*CA · · Score: 1

      Query every 5 minutes for changed data that fits the backup parameters (within the system dir, the user's home dir, certain filetypes) and then transfer the data as the network isn't being used.

      Unless I'm reading you wrong here, with a 5 minute delay you can already do this with rsync, a shell script and a cron job. According to the article this guy is doing it in near real time across the network (from what I can tell) by intercepting the write calls to the file system driver(s).

      Not sure how else you could do it without involving hooks into the drivers themselves, unless you have really frequent polls to the file system to check for changes, which seems to me would be very expensive.

      Just a thought...

    5. Re:Good idea, but there has to be a better way by geniusj · · Score: 1

      Or the kevent framework that's been in FreeBSD for a long time now. I'm pretty sure that can accomplish this as well. But someone correct me if I'm wrong.

    6. Re:Good idea, but there has to be a better way by Anonymous Coward · · Score: 0

      Then you loose the realtimeness.

      It's lose, dumbass.

    7. Re:Good idea, but there has to be a better way by disappear · · Score: 2, Funny

      Actually, he's letting the realtimeness (whatever in the heck THAT is) out of its cage, letting it loose.

      Then it runs away and hides where nobody can find it. So he also does lose it, but only because he loosed it.

    8. Re:Good idea, but there has to be a better way by Jack9 · · Score: 2, Interesting

      Device drivers would be the best solution for me. I want an exact copy of what I wrote to a physical drive. Hook, encrypt, send to another HD to repeat. Realtime, low-level. This allows it to be relatively fast (as opposed to having to process through layers of abstraction), accurate (as opposed to something an abstraction might do to it), and realtime...

      I want dual transactions. 1 for onsite and 1 for offsite. I'm not even interested in encrypting the data. I need to be able to kill my onsite immediately and failover to the offsite with a simple endroute change. I need to be as realtime as possible...Why would I want a 5 min backup? I can get near-realtime NOW with many of the systems in this thread; I just want nearer.

      --

      Often wrong but never in doubt.
      I am Jack9.
      Everyone knows me.
    9. Re:Good idea, but there has to be a better way by MonkeyOfRage · · Score: 2, Insightful

      No political party has a monopoly on wisdom or ignorance

      No, but when it isn't frustrating, it's hysterical watching them try to corner the market.

    10. Re:Good idea, but there has to be a better way by h4ck7h3p14n37 · · Score: 1

      That was my first thought after reading the article. It would be much simpler to write a single geom class to handle this than to muck with a bunch of device drivers.

      Was geom ever ported to NetBSD?

    11. Re:Good idea, but there has to be a better way by mellon · · Score: 1

      You can already do this - just do a recursive descent of the filesystem tree. SQL is just an interface for doing the same thing. It may be that the MS filesystem is more efficiently organized for doing this kind of query, but that's another issue.
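
      Something like this is what I mean by a recursive descent, as a rough sketch. The function name `changed_since` and the suffix filter are made up for the example; a real backup job would also want to handle deletions and permissions.

```python
import os


def changed_since(root, since, suffixes=None):
    """Walk `root` and list files modified after the `since` timestamp."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Optional filter standing in for "certain filetypes".
            if suffixes and not name.endswith(tuple(suffixes)):
                continue
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > since:
                hits.append(path)
    return sorted(hits)
```

      Run it from cron every 5 minutes with the time of the previous run as `since`. Note that it has to stat every file on every pass, which is the cost people are complaining about elsewhere in this thread.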

    12. Re:Good idea, but there has to be a better way by ivoras · · Score: 2, Informative

      Sadly, no, not even to DragonflyBSD. Don't know why, but maybe it's because it uses kernel threads internally...

      --
      -- Sig down
    13. Re:Good idea, but there has to be a better way by Nikker · · Score: 1

      All this means is that when the call occurs to save data to disk, the call writes not only to the primary disk but also to a network device in parallel. It's almost like a network RAID setup; realtime only means that the writes are "requested in parallel", as network latency will be much higher than on-device storage. For mission-critical use, the data to be saved would have to remain in memory until it can be verified that it has been successfully stored at both points, which should really be done regardless.

      As for polling every X mins, it may not work as well, since during busy times of day you may have massive amounts of data in queue (and just as much memory needed to store it short-term while it waits to be written). Performance will also get hurt as the stack gets flushed, interrupting currently running processes, and will continue to do so until the built-up data is sent and verified. Your server will be stalled at intervals, causing catch-up conditions. Dealing with the data as it is encountered allows better use of intermediate storage off the main machine to help with the bottleneck when saving over slower networks (i.e. VPN over INet), of course depending on your connection to the destination ;)
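
      To make the "keep it in memory until both copies are verified" part concrete, here is a toy Python sketch. It is not how der Mouse's code works; `MirroredWriter`, the dict standing in for the disk, and the `send_to_backup` callback are all made up for illustration.

```python
class MirroredWriter:
    """Write locally at once; hold each block in memory until the backup confirms it."""

    def __init__(self, local_disk, send_to_backup):
        self.local_disk = local_disk          # hypothetical: dict of block number -> bytes
        self.send_to_backup = send_to_backup  # hypothetical: callable, returns True on ack
        self.unconfirmed = {}                 # blocks the backup has not acknowledged yet

    def write(self, block_no, data):
        self.local_disk[block_no] = data
        self.unconfirmed[block_no] = data
        self.flush()

    def flush(self):
        """Retry every unconfirmed block; forget it only after an ack."""
        for block_no, data in list(self.unconfirmed.items()):
            if self.send_to_backup(block_no, data):
                del self.unconfirmed[block_no]
        return len(self.unconfirmed)
```

      If the link is down, `unconfirmed` just grows until the next successful `flush()`, which is exactly the memory cost mentioned above.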

      --
      A loop, by its nature, continues. If that didn't make sense, start reading this sentence again.
    14. Re:Good idea, but there has to be a better way by g1zmo · · Score: 1

      It's the female of the species - like a lioness. I think the Saharan Realtimes are on the endangered list.

      --
      I have found there are just two ways to go.
      It all comes down to livin' fast or dyin' slow.
      -REK, Jr.
    15. Re:Good idea, but there has to be a better way by Anonymous Coward · · Score: 0

      I dunno BSD but linux has fnotify (or pnotify?), like MacOS has whatever facilitates spotlight now, and BeOS had whatever it had that takes a notify list and proactively returns hits on those files/dirs whenever they are altered in whatever way you requested notification of (I guess .. I forgot those kinds of details).

      So on linux, you could use fnotify/pnotify to have some process start up and copy/rsync your files whenever they are modded, with pretty much no overhead of the type you described, pretty much in realtime, and no need for new device driver hacks.

      But I've never done it myself ... so far I haven't seen the need :)

      The nice idea about waiting 5 minutes (or perhaps even longer) after the changes before syncing them (yes, I'd only sync files that had changed *more than* 5 minutes ago) is that you don't end up copying 10 tonnes of tmp files that were created and destroyed within an instant during say some rsync operation that you were doing incidentally between two other unrelated directories .. (or in a windows analogy, the example could be the tmp files created when installing some big app that are ultimately just copied and then deleted).

      These types of savings are valuable, else we may as well use thin clients :) But I guess in this scheme, you would designate not to sync /tmp, and hope everyone only ever creates tmp files in /tmp. Or whatever..

      BigGod
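
      A rough sketch of that "only sync what changed more than 5 minutes ago" filter, in Python. The class name `SettleQueue` and its methods are made up; whatever notification mechanism you have would call `changed()` and `deleted()`.

```python
import time


class SettleQueue:
    """Track changed paths and release them only after they have been quiet a while."""

    def __init__(self, settle_seconds=300):
        self.settle_seconds = settle_seconds
        self.last_change = {}  # path -> time of most recent change

    def changed(self, path, now=None):
        self.last_change[path] = time.time() if now is None else now

    def deleted(self, path):
        # A temp file that vanished before settling is never synced at all.
        self.last_change.pop(path, None)

    def ready(self, now=None):
        """Return (and forget) the paths whose last change is older than the settle time."""
        now = time.time() if now is None else now
        done = sorted(p for p, t in self.last_change.items()
                      if now - t > self.settle_seconds)
        for p in done:
            del self.last_change[p]
        return done
```

      Files created and destroyed within the settle window never show up in `ready()`, so the short-lived tmp files don't get copied.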

    16. Re:Good idea, but there has to be a better way by animus9 · · Score: 1

      I'm pretty sure I recall reading that Matt didn't think GEOM solved the problem in the most desirable way, consequently nobody has tried to port it over. Although, don't quote me on it.

      --
      I eat bees -- they taste stingy.
    17. Re:Good idea, but there has to be a better way by phoenix_rizzen · · Score: 1

      On FreeBSD, you just use GEOM Gate (ggated and ggatec) to create a network filesystem/partition. Then you use GEOM Mirror (gmirror) to create a RAID1 array using the local disk and the ggate disk. The GEOM disk layer handles everything for you from there on. No special driver hooks required, works with any and all disks.

      If you want to get fancy, you could use ggate to create two network disks/partitions, and graid3 to create a RAID3 array. But the performance probably wouldn't be all that great. :)

    18. Re:Good idea, but there has to be a better way by deKernel · · Score: 1

      You might want to check out the DragonFly website. The approach that they are taking is completely different. I would attempt to explain, but I am pretty sure I would show off my stupidity.

  6. DoubleTake by ROOK*CA · · Score: 2, Insightful

    Sounds like it's essentially a DoubleTake daemon for BSD, cool, I wonder how well it scales? Say if you wanted to fully mesh 10 or more servers or something. Sounds like it might come in handy for keeping the content in web farms in synch as well....

  7. accidental deletion? by autopr0n · · Score: 2, Insightful

    Obviously, RAID servers don't help you in the case of accidental deletion. And they certainly don't help if your whole computer gets blown up.

    Still, you'd want to be careful with this, it would suck to back up all the temp files generated by random processes.

    --
    autopr0n is like, down and stuff.
    1. Re:accidental deletion? by indifferent+children · · Score: 1
      Still, you'd want to be careful with this, it would suck to back up all the temp files generated by random processes.

      In UNIX systems, all temp files usually reside in /tmp, which need not be on a RAID partition (unless you want those processes to stay up when you lose one of your drives).

      --
      Censorship is telling a man he can't have a steak just because a baby can't chew it. --Mark Twain
    2. Re:accidental deletion? by Umrick · · Score: 2, Interesting

      What I want is something like Plan9's Fossil+Venti file system. Versioned, with permanent copies offloaded to archive media. It's a rather nice, though not blazingly fast, complete view of data. Not the rather ephemeral view most of us take. Restore to any point in time since inception.

      Failing that, something like OpenAFS with mirrored globally addressable volumes that can work at the system level rather than user level. Sure you can use IP security for OpenAFS, and a few brave folks have even gotten network booting to OpenAFS working... Again, snapshots are an option.

      The world view of data should really be 1. Create 2. Version 3. Archive. Instead we have as-it-stands-now and arbitrary-backup-in-case-of-failure-or-user-stupidity. Sure some people do put /etc under SVN/CVS control, but not many.

  8. Cool, but not new by BlankStare · · Score: 2, Informative

    This concept has been in play for years as a commercial product for Disaster Recovery, Veritas Volume Replicator (VVR).

    1. Re:Cool, but not new by just_another_sean · · Score: 1

      Yes but this is free. And it comes with source code. Yum.

      --
      Creationist Textbook Stickers Declared Unconstitutional by CowboyNeal
  9. Der Mouse? by slavemowgli · · Score: 1

    Those crazy Germans.

    --
    quidquid latine dictum sit altum videtur.
    1. Re:Der Mouse? by Red+Flayer · · Score: 2, Informative

      "Those crazy Germans."

      No, that would be "der Maus"

      You crazy Americans -- Hier ist der Maus!

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
    2. Re:Der Mouse? by isecore · · Score: 2, Informative

      Those crazy Germans.

      According to the article, he's canadian.

      --
      I enjoy large posteriors and I cannot prevaricate.
    3. Re:Der Mouse? by Sponge+Bath · · Score: 1
      No, that would be "der Maus"

      Not to be confused with Die Fledermaus!
      Spoooon!

    4. Re:Der Mouse? by Ulrich+Hobelmann · · Score: 1

      Correct would be "hier ist die Maus", because mouse is feminine in German. "Die Sendung mit der Maus" means "show with the mouse", where "der" is the dative case.

      (don't mean to nitpick, just thought I'd correct it)

    5. Re:Der Mouse? by Red+Flayer · · Score: 1

      Sorry, I recall only a little German...

      He barely came up to my waist. :)

      Seriously though, it's been 15 years or so since I studied German, so thanks for pointing it out, it looked wrong to me, but I couldn't be bothered to look up the gender.

      --
      "Trolls they were, but filled with the evil will of their master: a fell race..." -- J.R.R. Tolkien on Olog-hai
  10. NBD? by mikeee · · Score: 3, Informative

    How does this compare with Linux Network Block Device? Sounds very similar.

    There are pretty mature commercial tools for this stuff, as well - Veritas' VVR replication comes to mind.

    1. Re:NBD? by tpgp · · Score: 3, Insightful

      How does this compare with Linux Network Block Device? Sounds very similar

      It doesn't compare at all.

      From my (quick) scan of the article - think of NBD as a replacement for NFS (well, sorta) & this as a sort of network RAID (kinda, not realtime).

      They're not really alike - for linux drbd is probably closer.

      --
      My pics.
    2. Re:NBD? by mikeee · · Score: 1

      Ah, that looks nice. I've heard of people running RAID-1 over NBD ('cause it's a block device!), but NBD apparently is a little flaky - there are a lot of kernel deadlock issues, and it doesn't sound like it ever quite picked up a userbase.

      Understandable; the people interested in remote replication mostly aren't interested in doing it with Alpha software. :)

    3. Re:NBD? by SmallSpot · · Score: 2, Informative

      Not the same as NBD, but it is very similar to DRBD (http://www.drbd.org/). I've used DRBD before, and it works quite nicely.

    4. Re:NBD? by Sentry21 · · Score: 1

      Repeat after me:

      RAID is not a backup solution!

      To answer the OP's question, it doesn't compare at all. NBD lets you export drives over the network so that they show up as block devices on remote systems (meaning you can do raw operations on them, use LVM, etc.); this, on the other hand, replicates changes to another filesystem.

      At first glance, this might not seem very effective for backup if deletes are replicated as well. That said, the benefit (as with replication in MySQL) seems (to me) to be that you can take a precise snapshot of the drive, however large it is, without having to take any services down, without slowing down the primary system, etc. Seems to me that that's probably the biggest benefit.

    5. Re:NBD? by rafa · · Score: 1

      We had it enabled where I work - but it took a while to get it tweaked right. In the beginning we got massive lag-spikes on our nfs-exported home dirs. It's a good idea, and I hope the problems with it can be ironed out.

      --
      [Science] is one of the very few things that raises human life a little above farce and gives it the grace of tragedy.
  11. Point In Time Recovery by h4ck7h3p14n37 · · Score: 1

    Shouldn't this technically be called a point in time recovery solution? When I think of a backup solution, I expect to be able to retrieve arbitrary files from an arbitrary point in time. Also, rather than mucking with the kernel, wouldn't it have been simpler to use the geom system?

  12. Reinventing the wheel? by mr_zorg · · Score: 1

    Isn't this guy reinventing the wheel? Why not just run a RAID 1 setup using iSCSI? Wouldn't that accomplish the same thing a lot easier?

    1. Re:Reinventing the wheel? by ROOK*CA · · Score: 1

      Possibly, but it would be a lot more EXPENSIVE as well, iSCSI HBA's + the iSCSI SAN device, not to mention what if you want to replicate your backups to multiple locations? then you're looking at replication agents on your iSCSI device.

    2. Re:Reinventing the wheel? by imemyself · · Score: 1

      You can do iSCSI pretty much entirely in software(I've done it with two VMWare VM's before) The only "special" hardware things that you would need to have are GbE NIC's that support jumbo frames as well as a switch that does. And you don't have to have those, though performance might not be quite as good. I don't know if there's iSCSI target/initiator software for NetBSD though.

      I'm sure having expensive HBA's can give you a lot better performance(as well as the ability to boot from an iSCSI drive), but just replicating stuff probably doesn't need really, really, top of the line $$$$$ hardware.

      --
      Every time you post an article on Slashdot, I kill a server. Think of the servers!
    3. Re:Reinventing the wheel? by yukonbob · · Score: 1

      I don't know if there's iSCSI target/initiator software for NetBSD though.

      A few days ago iSCSI target code and HOWTOs were submitted by Alistair Crooks... no initiator code yet. See here

      Looks like it's in pkgsrc (devel/netbsd-iscsi)

      -yb

  13. It All Makes Sense Except... by eno2001 · · Score: 1

    ...how do you get ALL the data on the backup server to start with? Pushing the writes off to the backup server in real-time is identical to what the HP VA7410 SAN I work with does internally in RAID 1+0 except that this happens over the network. But how are the disks in the backup server ever going to get all the original filesystem data if that data already exists AFTER you build your backup server? Even if you have a log of writes, you can't reconstruct the data. You'll only be able to reconstruct recent changes.

    --
    -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
  14. web-based by MikeFM · · Score: 0

    I've been doing this with a web-based system. Not as direct but works automatically when you connect to the site. Platform independant that way.

    --
    At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
    1. Re:web-based by Thundersnatch · · Score: 1

      You built this yourself? How do you handle differential compression through a web browser? How do you compare file signatures? Handle permissions?

    2. Re:web-based by MikeFM · · Score: 1

      Java applets can do darn near anything.

      --
      At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
    3. Re:web-based by Thundersnatch · · Score: 1

      Do you have a link to any code? Or references on the net? What you describe is a significant improvement over rsync/rdiff-backup for mobile users, and I'd like to know more.

    4. Re:web-based by MikeFM · · Score: 1

      I haven't yet decided if I'll opensource the system or not. I'm leaning towards opensourcing the client portion (the Java applet) but licensing the server software out. Or maybe I'll go with the new GPL if it protects my rights on software that is used as a service rather than distributed.

      I've done a few test runs which was enough to let me know I had to look into getting a multi-TB server farm before I could open it to the public. I've been trying to get investors for that (it costs around $250/mo per TB of server space) but looks like Google is going to kill my business plan.

      --
      At what price learning? At what cost wisdom? The price is a man's peace of mind, and the cost is his life.
  15. DRBD by Anonymous Coward · · Score: 1, Informative

    How is this any different from DRBD (http://www.drbd.org/)?

    From the website:

    DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1.

    Each device (DRBD provides more than one of these devices) has a state, which can be 'primary' or 'secondary'. On the node with the primary device the application is supposed to run and to access the device (/dev/drbdX; used to be /dev/nbX). Every write is sent to the local 'lower level block device' and to the node with the device in 'secondary' state. The secondary device simply writes the data to its lower level block device. Reads are always carried out locally.

    If the primary node fails, heartbeat is switching the secondary device into primary state and starts the application there. (If you are using it with a non-journaling FS this involves running fsck)

    If the failed node comes up again, it is a new secondary node and has to synchronise its content to the primary. This, of course, will happen without interruption of service in the background.

    And, of course, we only will resynchronize those parts of the device that actually have been changed. DRBD has always done intelligent resynchronization when possible. Starting with the DRBD-0.7 series, you can define an "active set" of a certain size. This makes it possible to have a total resync time of 1--3 min, regardless of device size (currently up to 4TB), even after a hard crash of an active node.
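
    To illustrate the "only resynchronize what changed" part, here is a toy Python sketch of a dirty-block set. This is my own illustration of the general idea and not DRBD's actual implementation; the `Primary` class, the dict standing in for the secondary, and the method names are all made up.

```python
class Primary:
    """Toy primary node that remembers which blocks changed while the peer was away."""

    def __init__(self, nblocks):
        self.blocks = [b""] * nblocks
        self.dirty = set()   # block numbers written while no peer was attached
        self.peer = None     # hypothetical secondary: dict of block number -> bytes

    def write(self, n, data):
        self.blocks[n] = data
        if self.peer is not None:
            self.peer[n] = data   # normal operation: mirror every write
        else:
            self.dirty.add(n)     # peer is down: just note the block number

    def disconnect(self):
        self.peer = None

    def reconnect(self, peer):
        """Resend only the dirty blocks, then resume mirroring."""
        resent = sorted(self.dirty)
        for n in resent:
            peer[n] = self.blocks[n]
        self.dirty.clear()
        self.peer = peer
        return resent
```

    The resync cost depends on how many distinct blocks were touched during the outage, not on the size of the device.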

    1. Re:DRBD by Slashcrap · · Score: 1

      How is this any different from DRBD (http://www.drbd.org/)?

      Just to save anyone else having to reply - this is for BSD and therefore automatically better.

  16. delayed backups are still useful by bitspotter · · Score: 1

    In every case I've actually needed backups to date, I find that, if I did them instantly instead of nightly, I'd end up losing data. The most common need for a backup for me comes when I've made a mistake with the main data, and I need to go back to what I had, say, yesterday.

    This isn't to say that instant backups wouldn't be nice for failover architectures, though. I just don't deal with systems that large, yet.

    1. Re:delayed backups are still useful by ROOK*CA · · Score: 1

      Nothing that says you can't do delayed backups with this solution as well, replicate to your (near) real-time backup machine across the network, then tape back-up the replicated machine, this way you're never having to run backups (loading) against your production box and you've got a near-line image sitting on your replicated machine for quick restores.

  17. protection by NynexNinja · · Score: 2, Insightful

    How does this protect against an rm -rf against the filesystem... I guess it would trash the backup on the other side.

  18. RTFA by brunes69 · · Score: 1

    Every time the server is started, it sends a command to all the clients causing a full sync of all changes that occurred while the server was offline. The same thing happens when a client is restarted: it sends a full sync to the backup server, and any blocks that do not match the client checksum are re-sent.

    Thus the first time you ran this thing it would copy the whole disk image to the backup server. After that subsequent writes would be the only output.
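
    Roughly, the checksum comparison could look like this in Python. This is a sketch of the general technique only; the block size, the use of SHA-256 and the function names are my assumptions, not details from the article.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, for illustration only


def block_checksums(image, block_size=BLOCK_SIZE):
    """Checksum each fixed-size block of a disk image given as bytes."""
    return [hashlib.sha256(image[i:i + block_size]).digest()
            for i in range(0, len(image), block_size)]


def blocks_to_resend(client_image, server_sums, block_size=BLOCK_SIZE):
    """Return the block numbers whose checksum on the server is missing or different."""
    client_sums = block_checksums(client_image, block_size)
    return [i for i, s in enumerate(client_sums)
            if i >= len(server_sums) or s != server_sums[i]]
```

    With an empty backup server every block is "different", so the first run sends the whole image. After that only the mismatched blocks go over the wire.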

    1. Re:RTFA by eno2001 · · Score: 1

      Err... thanks. I DID RTFA and I didn't see that section AT ALL.

      --
      -"...bad old ideas look confusingly fresh when they are packaged as technology" - Jaron Lanier (Digital Maoism on Edge.o
  19. Real-time accidental deletion, too. by evilviper · · Score: 3, Insightful

    This is basically RAID over the network. Personally, I can't see a lot of use for it... Just put the second drive in the machine, and use software RAID, rather than putting the second drive in a network server. Less network slowdown and congestion that way, not to mention CPU-time wasted packetizing, encrypting, etc.

    As always, RAID (and now this) is not a backup solution.

    --
    Slashdot gets worse every day... Pipedot: News for nerds, without the corporate slant
    1. Re:Real-time accidental deletion, too. by lheal · · Score: 1

      To belabor that point just a little bit, my personal observation is that for every hardware-based data loss event I've experienced, there have been 10 user-based events.

      Just today I had to recover the Inbox of a user who deleted a message but didn't know who sent it, when it arrived, or what the subject of it was. He also wasn't sure when he deleted it, so I had to do the restore twice.

      I keep a lot of data backed up on disk, rsynched once a day. Some data I even back up once an hour. It doesn't cost anything, but it's nice to say "sure, I've got a backup of that." They think I've got powers. I don't, of course, but I learned that backup is part-and-parcel of my job.

      --
      Raise your children as if you were teaching them to raise your grandchildren, because you are.
  20. Invented by Larry Robertson in the mid 1980s by Anonymous Coward · · Score: 1, Interesting

    Larry Robertson came up with this concept in the mid 80s as I recall, implemented it for VMS way back then as a remote shadowing system. He told me about it in one of the Anaheim DECUS meetings back then, published it (his company being called Bear Software back then). While the idea was not patented, the idea of moving updates wide area and doing local journalling so that the "shadow" needed only to keep up with average write I/O rates, rather than peak write rates, was AFAIK new back then and he deserves credit for thinking of it and implementing it. If anyone should try to patent it, he could be contacted also to show prior invention and public description. (Another outfit that came out shortly later with something similar had, as I have been told, a copy of the Bear program in house. That suggests my belief is correct that Robertson came up with the idea first, and that duplicate invention did not occur there.)
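
    A quick back-of-the-envelope simulation of why local journalling helps, in Python. This is only my illustration of the average-versus-peak point and has nothing to do with the Bear code; the function name and the numbers are made up.

```python
def simulate_backlog(writes_per_tick, drain_per_tick):
    """Track how many writes sit in the local journal after each tick."""
    backlog = 0
    history = []
    for writes in writes_per_tick:
        backlog += writes                         # burst lands in the local journal
        backlog -= min(backlog, drain_per_tick)   # the wide-area link drains at a fixed rate
        history.append(backlog)
    return history
```

    With bursts of 10 writes every fourth tick (average 2.5 per tick) and a link that moves 3 per tick, `simulate_backlog([10, 0, 0, 0, 10, 0, 0, 0], 3)` gives `[7, 4, 1, 0, 7, 4, 1, 0]`. The journal empties between bursts even though the link is far slower than the peak rate.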

    1. Re:Invented by Larry Robertson in the mid 1980s by MichaelSmith · · Score: 1

      Yes vms volume shadowing works really well, compared with the different techniques in use with *nix. I was involved with running the traffic management systems here in Victoria off vms clusters with various types of shadowing, both through decnet and scsi (dssi).

      Somebody told me a story about a vms system hit during the Oklahoma City bombing. The other half of the cluster was on the other side of the city.

  21. Needful things... by Anonymous Coward · · Score: 0

    Great for backing up the internet to your Iomega 80GB USB drive.

  22. but Multics had this in 1970 or so.... by apl73 · · Score: 2, Insightful

    It's almost a troll to even mention it, since there are so many things pioneered by Multics....

  23. oops! by krismon · · Score: 4, Funny

    Oh no! the rootkit got replicated to the backup server!

  24. Journaling? Use that... by Jherek+Carnelian · · Score: 1

    It seems to me that if you use a journalling filesystem that journals everything, not just meta-data, you can just send the journal logs off to your backup device. Presuming your backup device starts with the same baseline data (i.e. a full level-zero dump), you would then have the ability to restore your files, or entire filesystem, to the state it was in at any point in time just by playing back the journal logs. Presumably a "smart" replay algorithm could be implemented that would use some sort of regular snapshots and coalescing of the logs to speed up restores (reducing the time spent doing the journal replay).

    Such an approach would not require hooks into each device driver either; it would be entirely at the filesystem level.
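
    The replay itself is simple enough to sketch in a few lines of Python. This assumes a made-up journal format of (timestamp, path, data) tuples sorted by time, with `None` as data meaning a delete; no real filesystem journal looks like this.

```python
def replay(baseline, journal, until):
    """Rebuild the file state at time `until` from a baseline dump plus a journal."""
    state = dict(baseline)          # start from the level-zero dump, leave it untouched
    for timestamp, path, data in journal:
        if timestamp > until:
            break                   # journal is assumed to be sorted by time
        if data is None:
            state.pop(path, None)   # a logged delete
        else:
            state[path] = data      # a logged write
    return state
```

    Restoring to just before an accidental delete is then a matter of picking `until` one tick earlier. The regular snapshots mentioned above would let you start from a later baseline and replay fewer entries.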

  25. Its his childish attempt to mock Theo. by Anonymous Coward · · Score: 0

    His real name is Mike Parker, and he wasn't good enough for OpenBSD. Since then, he has a problem with Theo for not accepting his crap. So he uses Der Mouse (Theo's last name is pronounced "de-rat") as his stupid alias.

    1. Re:Its his childish attempt to mock Theo. by Ryvar · · Score: 1

      Heh, when I was reading this the fortune that appeared at the bottom of the page - right underneath your comment - was:

      Passionate hatred can give meaning and purpose to an empty life. -- Eric Hoffer

      Cute coincidence.
      --Ryv

  26. Re:BSD Is Dying (over the net now) by bod1988 · · Score: 0

    Yay for you. Now go back to masturbating over pictures of tux and stop posting trolls with links that are 5 years old.

  27. Full rescan by countach · · Score: 1

    It needs to do a full rescan on reboot?

    UGH!

    That kills it for me.

    1. Re:Full rescan by Anonymous Coward · · Score: 0

      You reboot your servers? Loser.

  28. Linux posts have missed FUSE mirroring filesystems by FellowConspirator · · Score: 1

    I just wanted to point out that there are several FUSE-based filesystem implementations that do the same thing (functionally, not implementation-wise) and they do not require hooks in the device drivers -- they don't even care what the filesystem is for the original or the backup.

    And, yes, RAID is a very good solution if you've got the money and are smart enough to recognize when a disk fails...

  29. Re:BSD Is Dying (over the net now) by EvilMalware · · Score: 1

    This article makes me laugh...get a life and don't troll Slashdot, kthx.

    --
    Nerdspeak.net
  30. DragonFlyBSD has the better way... by KonoWatakushi · · Score: 2, Informative
    There is actually a better way, and it is being implemented in DragonFlyBSD. Instead of duplicating writes at the device level, VFS operations are logged to a journal descriptor, which may be a file or network pipe. As this is performed in a VFS layer, it is possible to use with any filesystem. However, it is not limited to remotely (or locally) mirroring a filesystem; with the journal available, it will also be possible to rewind the state of the filesystem to any point in time, subject to the journal size.

    So what you have is not only a real-time backup, but also the ability to unwind any possible damage after a break-in or other event. If your backup server is only running a journalling client, it can be made extremely secure, and also provide an excellent auditing tool.

    It would also be possible to delay the streams and have arbitrarily old filesystems available. Or to use a local journal as a buffer to smooth out the IO load as it is piped off site. It could also be used to augment a non-journalling filesystem, for crash recovery purposes, assuming your filesystem provides at least some consistency guarantees. In fact NetApp does something similar by logging FS operations to NVRAM, while the filesystem only writes consistency points periodically.

    Although I haven't read about the NetBSD work, I am sceptical that they could get the error handling to work correctly at that level. With the DragonFly journalling, there is support for transactional consistency, as well as recovering from interrupted network connections. While it is not complete, much of it is in place and functional.

    See the mountctl manual page for attaching a journal to a mount, and the jscan manual page for processing the journal.

  31. Really bad Design by Anonymous Coward · · Score: 0

    Good lord, am I the only one who thinks this isn't cool and is possibly the worst implementation of a network backup system?

    Now it's possible I'm not seeing this the right way, but as I understand it, the entire backup process is embedded in the kernel. Modularity gets kinda difficult then, and it sucks to be you if the backup process at the other end craps itself and pushes data into the IP link that makes your kernel crap itself too. If a change needs to be made to the backup process, then you have to start messing around with the kernel. Not clever.

    How about this for a theoretical backup process:
    backupd asks the kernel to notify it when a directory or file changes.
    Upon notification, backupd forks a process (or adds the file to the queue) and pushes the changed file down a pipe. Since we can chop and change the pipe at will, we can use ssh, or substitute our encryption/authentication methods at any time.

    Hey guess what, a normal user can run this on a standard kernel, or root can run the same process to do a backup of an entire directory, and it can be installed on any filesystem. Gosh.
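
    A rough sketch of that backupd idea, in Python. Big assumption: instead of real kernel notification (kqueue, inotify or whatever your OS has), this just polls file mtimes, because that is the only portable thing in the standard library. The function names and the length-prefixed framing on the pipe are invented for the example, and deletes are not propagated at all.

```python
import os
import struct

def scan(root):
    """Return {relative path: (mtime_ns, size)} for every regular file under root."""
    seen = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # vanished or unreadable between listing and stat
            seen[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return seen

def push_changes(root, previous, pipe):
    """Write every new or modified file to `pipe` and return the new scan.

    Made-up framing: 4-byte path length, path, 8-byte data length, data.
    `pipe` is any writable binary stream, so the transport can be swapped freely.
    """
    current = scan(root)
    for rel, stamp in sorted(current.items()):
        if previous.get(rel) == stamp:
            continue  # unchanged since the last pass
        with open(os.path.join(root, rel), "rb") as fh:
            data = fh.read()
        name = rel.encode("utf-8")
        pipe.write(struct.pack("!I", len(name)) + name)
        pipe.write(struct.pack("!Q", len(data)) + data)
    pipe.flush()
    return current
```

    The pipe could be the stdin of an ssh child process or a plain file; calling push_changes in a loop with the previous return value gives you the queue-and-push behaviour without touching the kernel.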

  32. Automatically replicate screw-ups too? by mccrew · · Score: 1

    Wouldn't this also replicate deletes across to the offsite machine in near-real-time? So if one were to accidentally delete a file, or a $HOME directory, or a complete filesystem, then there would be no way to recover from this from the "backup" machines, because their files would have gotten nuked too?

    --
    Hey, Windows users, there is no such thing as "forward" slash, there is only slash and backslash.
    1. Re:Automatically replicate screw-ups too? by Miniluv · · Score: 1

      If you choose to implement it that way, then yes, it would. Dunno about the NetBSD implementation, but real commercial ones know the difference between create, update and delete, and handle each differently, allowing for file versioning and deletion versioning in the backups.
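
      To make that concrete, here is a toy sketch in Python of a backup store that treats the three operations differently and never throws old versions away. The class and method names are hypothetical and this is not how any particular commercial product does it; it only shows why a replicated delete need not destroy the backup copy.

```python
import time

class VersionedBackup:
    """Toy backup store: never overwrites, only appends versions."""

    def __init__(self):
        self.history = {}  # path -> list of (timestamp, op, data), oldest first

    def record(self, op, path, data=None, now=None):
        """Log one replicated operation; `now` lets callers supply the timestamp."""
        if op not in ("create", "update", "delete"):
            raise ValueError(op)
        ts = time.time() if now is None else now
        self.history.setdefault(path, []).append((ts, op, data))

    def restore(self, path, when=None):
        """Return the contents as of `when` (latest if None), or None if absent or deleted."""
        result = None
        for ts, op, data in self.history.get(path, []):
            if when is not None and ts > when:
                break
            result = None if op == "delete" else data
        return result
```

      A delete is just one more entry in the history, so restoring with a `when` from before the accident brings the file back.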

  33. Not true... by Anonymous Coward · · Score: 0
    this acts as a poor-man's replacement for RAID-1 (mirroring), with the same problems inherent in that system that make it unsuitable for general backups. Consider a simple command - "rm -rf s *". Ooops!

    I think there is a great deal of misunderstanding about the mikerubel.org article; rsync/hard-link is far better suited to backup than RAID-1 (which, as you suggest, isn't really a backup at all). For starters, rsync/hard-link backups live on a separate disk on a separate machine. To erase them, you have to gain access to that machine, which may be strongly locked-down--mine is accessible only from the console. Many point-in-time backups are created by the method, though storage is efficient, and they can be NFS exported back, read-only, to one or more clients. It's much easier to verify and restore than tape. If you're really uncomfortable with online backups, buy USB 2 hard drives and use the same technique to make backups onto them. You can unplug the hard drives when you're done and place them in a vault. Or build two backup servers and make sure at least one is always powered-off.
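
    For anyone who hasn't seen the hard-link trick, here is a small Python sketch of the idea. It is a stand-in, not rsync and not the script from the article: the function name is made up, it compares file contents byte for byte, and it assumes the old and new snapshots sit on the same filesystem (hard links can't cross filesystems).

```python
import filecmp
import os
import shutil

def snapshot(source, previous, target):
    """Copy `source` into `target`, hard-linking files unchanged since `previous`.

    `previous` is the last snapshot directory, or None for the first run.
    """
    for dirpath, _dirs, files in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        dest_dir = os.path.normpath(os.path.join(target, rel_dir))
        os.makedirs(dest_dir, exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            new = os.path.join(dest_dir, name)
            old = os.path.join(previous, rel_dir, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the inode with the old snapshot
            else:
                shutil.copy2(src, new)  # new or modified: store a fresh copy
```

    Each snapshot directory looks like a full copy, but unchanged files cost no extra space, and deleting or changing a file on the source leaves every earlier snapshot intact.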

  34. NetBSD? by RavenChild · · Score: 0

    This seems like they are really putting emphasis on this feature. Could it mean they might need it if NetBSD dies?

  35. Re:you have to give them credit by Anonymous Coward · · Score: 0

    1) Ghost is a piece of shit. I've never talked to anyone who has used it with success. It seems to always corrupt its own files. A simple tarball is better in a number of ways IMO.
    2) Ghost is not the same as this. RTFA. This is a change to the device drivers that replicates and records what has been done at a block level over an encrypted network connection to a backup server. This isn't really meant for the average user.

  36. Network Appliance by SignalX · · Score: 2, Informative

    NetApp has been doing something similar for a very long time. A lot of people use the Sun boxes on the frontend to boot or attach to the storage appliance and let it do the backups. It saves space and saves the server from having to do it.

  37. Fools, BSD is dea . . . by Maradine · · Score: 1

    oh, wait, what?

    --

    trustedworlds.net - gaming, security, and the gunk that lives in between

  38. Re:B.S. D? The elements. by Keaster · · Score: 1

    You are missing a few things like fire, flood, lightning etc.

  39. Linux too by einhverfr · · Score: 1

    with the Coda filesystem. Or am I missing something?

    --

    LedgerSMB: Open source Accounting/ERP
  40. Isn't this the same as... by nigham · · Score: 1

    rsync -avz ~/ user@remote:homebackup

    in crontab?

    --
    I don't want to read /. I want to go home and re-think my life.
  41. Silly OBSD Troll (Re:... mock Theo.) by kjs3 · · Score: 2, Informative

    Actually, he's been around as Der Mouse since I was in college (circa 1985). I ran the xterm replacement he wrote back then. Long, long before Theo had his hissy fit and forked off OBSD. Of course, a trivial Google would have shown that, but hey, an AC wouldn't want to miss out on an ad hominem flame...

  42. Re:First Post by Anonymous Coward · · Score: 0

    Mod parent up, "Funny"!

  43. RAID1 as backup by dwater · · Score: 1

    Ever considered using RAID1 in a backup system? I've not tried it, but it isn't difficult to see how it can be implemented.

    Think of it in a similar way as a tape-based backup system...

    1) implement RAID1,
    2) have many spare disk drives (they're cheap now),

    when you want a snapshot backup :
    3) 'fail' one of the drives,
    4) remove it,
    5) install spare drive and add it to the RAID1 (it'll rebuild automatically),
    6) take 'failed' drive off site/lock it in safe/whatever you would do with a tape

    I'm not entirely sure how one would restore, but it should be fairly easy.

    --
    Max.
  44. Dealing with Loss by Anonymous Coward · · Score: 0
    Of course you mourn the demise of *BSD. It's only natural. Dealing with the death of an operating system close to you can be one of the most traumatic experiences of your life, and you're bound to go through a range of emotions. While you may be able to work through those feelings on your own, it's often helpful to talk to a friend, a family member, or a counselor. You might also seek out a support group for people who are grieving.

    Grieving is a process, and it's totally normal to go through feelings of shock, sadness, anger, even guilt. The healing process is different for everyone. It might take you six weeks to move on, or it might take you six years. Don't beat yourself up because you're not "over it" yet. It takes time to heal wounds.

    So what else can you do to feel better? It might sound corny, but try writing a letter, making a collage, or planting a tree in memory of the operating system you've lost. Remembering and celebrating all the good things *BSD brought to your life might help give you some closure, and having a keepsake to honor *BSD may help you get through some tough times in the future when you'll be missing it.

    It's true that life won't be the same without *BSD around. It may seem like you'll never feel better, but eventually you will. Take some comfort in the old saying, "Time heals all wounds," and remember that *BSD will always be with you in your heart.

    1. Re:Dealing with Loss by ScrewMaster · · Score: 1

      Dealing with the death of an operating system close to you can be one of the most traumatic experiences of your life, and you're bound to go through a range of emotions. While you may be able to work through those feelings on your own, it's often helpful to talk to a friend, a family member, or a counselor. You might also seek out a support group for people who are grieving.

      Funny ... when Windows 9x died I felt nothing but an overwhelming feeling of relief, and a certain sense of vindictiveness.

      --
      The higher the technology, the sharper that two-edged sword.
  45. OT Post by Anonymous Coward · · Score: 0

    Why is the "Sections" missing BSD?

  46. Re:BSD Is Dying (over the net now) by Anonymous Coward · · Score: 0
    *BSD Sux0rs

    In a startling turn of events today, a previously little-known fact came into the public eye: "*BSD Sux0rs". This came as a complete surprise to the BUWLA, or BSD Users With Large Assholes, as they previously thought that *BSD 0wned.

    "You see, even though I have never contributed code to any BSD project, I thought it was my duty to be a big asshole to others which don't use the OS I do, because it just 0wnz.", said one FreeBSD user. "Now that I know it sux0rs, though, I have to go find something else to be an asshole about."

    One notorious OpenBSD fanatic known as WideOpen, told reporters, "I have to kill myself. This isn't how it was supposed to happen. My BSD has always been the best, and shouting that opinion in other people's faces at every chance I got has been my only hobby. It was all I ever did. It was what got me out of bed in the morning. Now I have to die. I will jam my bedpost up my ass until I hit my brain. It is the only way to go: BSD style."

    In the volatile world of operating systems anything can happen. "At least we don't sux0r as much as Windows users", BigAzz, a relatively well-known NetBSD user said. "Screaming things in people's faces is my calling. Now I need to scream that BSD sux0rs. What a sad world. At least I won't kill myself like those uber-asshole OpenBSD guys. They are just way over the top. Or were, at least."

    Nobody knows for sure what the future holds for the state of operating systems, but with Netcraft confirming the sux0r status, *BSD users all over the world will have to stick something else up their asses from now on or risk looking even more gay than they used to.

  47. Being able to rollback local filesystems by typical · · Score: 1

    If you aren't looking for network functionality, there's a filesystem called ext3cow that lets you roll back to older versions of the contents of the filesystem.

    --
    Any program relying on (nontrivial) preemptive multithreading will be buggy.
  48. Ughhhh by Anonymous Coward · · Score: 0

    RAID is not a backup solution.. arghhh! When will these people learn.

  49. Some Big Negatives by headLITE · · Score: 1
    - Essentially the same as DRBD, but built into the kernel (which is bad)
    - Essentially the same as Linux' md plus AoE (ATA over Ethernet), which is also built into the kernel, but more modular
    - Essentially the same as Linux' md plus Linux' nbd (network block device), which is also built into the kernel, but more modular
    - Built into the kernel when it could be a daemon that is notified of changes by the kernel instead (don't know if NetBSD's kernel does this, others do)
    - It's not a backup solution but a RAID: It only protects you from disk failures, not from brain failures, something that backup solutions can do

    I work at a research institute where we run a cluster of Linux workstations that boot from a server and keep all their data on a file server. Both servers are Linux boxes that use drbd to keep their configuration synchronized. The file server is a set of two front-end machines (one active, one waiting) that additionally use md+nbd to create a network RAID-1 over many back-end nodes. Each back-end node has a RAID-1 of local disks. This means that each disk in each back-end is redundant and each back-end is redundant, so that all data is stored four times instantly upon each write (yes, we do have a separate backup solution), distributed over two separate floors in different parts of the building. And this was implemented by one single totally underpaid student (granted, he's good...)!

    So tell me again what's new about TFA.

    1. Re:Some Big Negatives by Anonymous Coward · · Score: 0

      Did you even bother to read the article's title? It's news about NetBSD, not Linux. *bonks you on the head for being such a tool*

    2. Re:Some Big Negatives by headLITE · · Score: 1

      I read that the article is about NetBSD, but it is still just another badly designed re-implementation of a known concept. It is in no way big news; instead, it reminds me a lot of Windows getting 64 bit capabilities - something everything else and their teenage sister was already doing anyway.

  50. Question from a troll... by Anonymous Coward · · Score: 0

    Now I'm probably Slashdot's most naive reader but...

    Couldn't the same be done on any OS using iSCSI and software RAID?

    Apparently I'm missing something.

  51. Re:BSD Is Dying (over the net now) by Slashcrap · · Score: 1

    Yay for you. Now go back to masturbating over pictures of tux and stop posting trolls with links that are 5 years old.

    Now this is what bugs me. Why do you always assume that it's Linux users trolling you? As you have demonstrated, BSD zealots are so eminently trollable that literally anyone can have a go. Even an MCSE could probably generate tens of angry replies with a cut & pasted troll.

    It's quite likely that you are in fact trying to insult a script. Have some self respect for fuck's sake.

  52. Re:Brazil exports 2 things by Anonymous Coward · · Score: 1, Insightful

    until you can write a program in Lua to give me a BJ, speak for yourself