NetBSD - Live Network Backup
dvl writes "It is possible but inconvenient to manually clone a hard disk drive remotely, using dd and netcat. der Mouse, a Montreal-based NetBSD developer, has developed tools that allow remote partition-level cloning to occur automatically on an opportunistic basis. A high-level description of the system has been posted at KernelTrap. This facility can be used to maintain complete duplicates of remote client laptop drives on a server system. This network mirroring facility will be presented at BSDCAN 2005 in Ottawa, ON on May 13-15."
Why not just use rsync and ssh?
I'm not up on my xBSD's, so can someone explain how hard this would be to port to the Mac? This would be perfect for cloning my son's Mac Mini.
It's much less network and hardware intensive and, with the right parameters, will keep past revisions of every changed file. Your hard disks will live longer.
This would be an extremely sensitive server system. With everyone's hard drive image just waiting to be blasted to a blank hard drive, the potential for misdeeds is staggering. Even in an official capacity, I would feel uneasy if my boss were able to take a copy of my hard drive image and see what I've been working on. Admittedly, yes, it should all be work, but here we are allowed a certain amount of freedom with our laptops and I wouldn't want to have that data at my boss's fingertips.
On the flipside, this would be a boon to company network admins, especially with employees at remote sites who have a hard crash.
Another reason to build a high speed backbone. Getting my 80GB hard drive image from Seattle while I'm in Norfolk would be a lot of downtime.
-Teiresias
...when you get that idiot (and EVERY company has at least 1 of these guys) who calls you up asking if it's OK to defrag their hard-drive after downloading a virus or installing spyware. Then, when you tell them "NO", they just tell you that they did it anyways.
Now we can just hit a button and restore everything, a few thousand miles away.
The only thing left is to write code to block stupid people from reproducing.
IGB: More fun than eating oatmeal!
...512 byte blocks as a lowest common denominator unit of exchange between client and server. At each client to server connection, the application identifies and maps changes to disk block states. Changed blocks are then encrypted and sent to the server. This indicates that a user could open his or her laptop in an airport, establish a WiFi link to an open access point, and remotely update their laptop backup without effort, knowledge or even good intentions.
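For anyone wondering what "identifies and maps changes to disk block states" could look like, here is a minimal sketch. It assumes the client keeps one digest per 512-byte block from the last sync and compares against it; the article does not say how the real tool tracks changes, so the digest comparison and every function name here are illustrative assumptions, and the encryption step is left out.

```python
import hashlib

BLOCK_SIZE = 512  # the "lowest common denominator" unit mentioned above


def block_digests(image: bytes) -> list[bytes]:
    """Return one SHA-256 digest per 512-byte block of a disk image."""
    return [
        hashlib.sha256(image[off:off + BLOCK_SIZE]).digest()
        for off in range(0, len(image), BLOCK_SIZE)
    ]


def changed_blocks(old_digests: list[bytes], image: bytes) -> dict[int, bytes]:
    """Map block number -> new contents for every block that differs."""
    changes = {}
    for number, digest in enumerate(block_digests(image)):
        if number >= len(old_digests) or digest != old_digests[number]:
            off = number * BLOCK_SIZE
            changes[number] = image[off:off + BLOCK_SIZE]
    return changes


def apply_changes(mirror: bytearray, changes: dict[int, bytes]) -> None:
    """Write changed blocks into the server-side mirror in place.

    Assumes the mirror is the same size as the client image.
    """
    for number, data in changes.items():
        off = number * BLOCK_SIZE
        mirror[off:off + len(data)] = data
```

Only the blocks in the `changes` map would need to cross the network, which is the whole appeal over re-sending the partition.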
What happens if you try to update while running heavy disk writes? Try to back up your swap?
Assuming you can get around bandwidth monitoring, how long before this becomes incorporated into hacking tools? Add this to a little spyware and a zombie network and things get very interesting for poorly secured networks & computers.
I've been using der Mouse to copy files for years. First I use der Mouse to click on the file, then I use der Mouse to drag it to a new location!
Doesn't NetBSD support dump -L the way FreeBSD does? This strikes me as a much more powerful and general solution than this custom tool...
I hereby place the above post in the public domain.
Maybe setup is inconvenient. Remote backups using dd and ssh (our method) were a bit of a bear to set up initially, but thanks to shell scripting, cron, and key agents, they haven't given us any problems. I've seen a few guides with pretty straightforward and mostly universal instructions for this type of thing. That being said, I do hope this software will at least get people to start looking seriously at this type of backup since it lets you store a copy off-site.
NFS will eventually bite you in the ass if successful writes are assumed by the client. Without digging through the code, can someone address this WRT the 'stuff' referenced in the article?
lived there?
muahahahahahahahahahaha
stop that's too much
It's so cool, Rush is almost from there!
If one tries to clone an FS that is active, can this cloning tool handle open/changing files (often the most important/recent-in-use files on the system)? I remember an odd bug in a Mac OS X cloning tool that would create massive/expanding copies of large files that were mid-download during a cloning.
Two wrongs don't make a right, but three lefts do.
Isn't there an automated network disk backup tool for paranoids like me?
Well, I'm not really paranoid, but I had some cases where faulty file system drivers or bad RAM modules changed the content of some of my files and where I have then overwritten my backup with these bad files.
Isn't there any automatic backup solution that avoids such a thing? What I have in mind: there should be several autonomous instances of backup servers (which may actually reside on desktop PCs linked via LAN) that control each other on a regular basis. They should also keep back old versions of files as far as disk space allows.
Then, there should be a KDE tray applet showing me the state of the backup server network. It would indicate if servers haven't been cross-checked for some time or if CRC errors or general malfunction problems have occurred.
Wouldn't that be nice? Never ever care again for your backups. It's all done in the background and in a totally paranoid manner.
Well, not a solution for BSD people (unless you're running a BSD under Xen and the top-level Linux kernel is doing the DRBD).
I, too, immediately thought of German when I saw "der Mouse" (although in German it would be "die Maus", since Maus is feminine). Since they're located in Montreal, however, it seems unlikely that they'd be inclined to use German, and would be more likely to go for a French reference. So I ask, where does the "der" come from?
Ben Hocking
Need a professional organizer?
While this is cool, as I thought when I saw it on KernelTrap, disk mirroring is useful in situations where the hardware is less reliable than the transaction. If you have e.g., an application-level way to back out of a write (an "undo" feature), then disk mirroring is your huckleberry.
Most (all) of my quick restore needs result from users deleting or overwriting files - the hardware is more reliable than the transaction. I do have on-disk backups of the most important stuff, but sometimes they surprise me.
I'd like a system library that would modify the rename(2), truncate(2), unlink(2), and write(2) calls to move the deleted stuff to some private directory (/.Trash, /.Recycler, whatever). Obviously the underlying routine would have to do its own garbage collection, deleting trash files by some FIFO or largest-older-first algorithm.
Just a thought.
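A rough user-space sketch of that trash idea, for the unlink case only. This is not an interposed system library: `trash_unlink` and `collect_garbage` are hypothetical helper names, the trash directory is assumed to live on the same filesystem as the files, and the garbage collection shown is the plain FIFO variant.

```python
import os
import time
from pathlib import Path


def trash_unlink(path: str, trash_dir: str) -> Path:
    """Instead of deleting `path`, move it into `trash_dir`; return the new location."""
    trash = Path(trash_dir)
    trash.mkdir(parents=True, exist_ok=True)
    # A zero-padded nanosecond timestamp prefix makes entries sort oldest-first.
    target = trash / f"{time.time_ns():020d}-{Path(path).name}"
    os.replace(path, target)  # a rename, so it only works within one filesystem
    return target


def collect_garbage(trash_dir: str, max_bytes: int) -> list[Path]:
    """FIFO collection: delete the oldest trashed files until the total fits max_bytes."""
    entries = sorted(Path(trash_dir).iterdir())  # timestamp prefix gives FIFO order
    total = sum(entry.stat().st_size for entry in entries)
    removed = []
    for entry in entries:
        if total <= max_bytes:
            break
        total -= entry.stat().st_size
        entry.unlink()
        removed.append(entry)
    return removed
```

Covering truncate(2) and write(2) the same way would mean copying data before it is overwritten, which is a much heavier operation than a rename.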
sigs, as if you care.
Novell Zenworks has had this capability for some time in production environments. It also integrates with their management tools so it is easy to use on an entire network. To say this technology is newly discovered is a far cry from the truth. They also use Linux on the back end of the client to move the data to the server.
It is nice, though, to have something like this in the open source world. Competition is good.
SIGS!!!We don't need no stinkin sigs
Assuming you can get around bandwidth monitoring, how long before rsync becomes incorporated into hacking tools? Add it to a little spyware and a zombie network and things get very interesting for poorly secured networks & computers.
Maybe I should patent this. Ah well, I figure if I mention it now it should prevent someone else from doing so...
:)
I was thinking - I know how Ghost supports multicasting and such. I was thinking about how to take that to the next level. Something like Ghost meets BitTorrent.
Wouldn't it be great to be able to image a drive, use multicast to get the data to as many machines as possible, but then use BitTorrent to get pieces to any machines that weren't able to listen to the multicast (i.e. it's on another subnet or something) and to pick up any pieces that were missed in the broadcast, or get the rest of the disk image if that particular machine joined in the session a little late and missed the first part?
I think that would really rock if someone wanted to image hundreds of machines quickly and reliably.
I'm thinking it'd be pretty cool to have that server set up, and find a way to cram the client onto a floppy or some sort of custom Knoppix. Find server, choose image, and now you're part of both the multicast AND the torrent. That should take care of error checking too, I guess.
Anybody care to take this further and/or shoot down the idea?
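To make the multicast-plus-torrent idea a bit more concrete, here is a toy in-memory sketch of the bookkeeping it would need: a per-piece hash manifest, a client that tracks which pieces it missed, and a repair pass that pulls the gaps from a peer. No real multicast or BitTorrent code is involved; the class and function names and the tiny piece size are assumptions made up for illustration.

```python
import hashlib

PIECE_SIZE = 4  # tiny for illustration; real pieces would be far larger


def split_pieces(image: bytes) -> list[bytes]:
    """Cut the disk image into fixed-size pieces."""
    return [image[i:i + PIECE_SIZE] for i in range(0, len(image), PIECE_SIZE)]


def piece_hashes(image: bytes) -> list[bytes]:
    """The per-piece manifest a server would publish, torrent-style."""
    return [hashlib.sha1(piece).digest() for piece in split_pieces(image)]


class Client:
    def __init__(self, manifest: list[bytes]):
        self.manifest = manifest
        self.pieces: dict[int, bytes] = {}

    def receive(self, index: int, data: bytes) -> bool:
        """Accept a piece (from multicast or a peer) only if its hash matches."""
        if hashlib.sha1(data).digest() != self.manifest[index]:
            return False
        self.pieces[index] = data
        return True

    def missing(self) -> list[int]:
        """Piece numbers still needed, e.g. after joining the multicast late."""
        return [i for i in range(len(self.manifest)) if i not in self.pieces]

    def fill_from_peer(self, peer: "Client") -> None:
        """Repair pass: pull whatever we lack that the peer already holds."""
        for index in self.missing():
            if index in peer.pieces:
                self.receive(index, peer.pieces[index])

    def image(self) -> bytes:
        """Reassemble the image; only valid once missing() is empty."""
        return b"".join(self.pieces[i] for i in range(len(self.manifest)))
```

The hash check in `receive` is what would "take care of error checking": a corrupted or dropped multicast packet just leaves the piece on the missing list.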
You can accomplish anything you set your mind to. The impossible just takes a little longer.
I've used Linux for years to do this using md running RAID1 over a network block device. It works very well unless you have to do a resync. Is this better than that?
I'm asking because I'm backing up about a dozen servers in real-time using this method, and if this method is more efficient, then I might be able to drop my bandwidth usage and save money.
I did that 12 years ago on AIX with no problems, as long as (a) the hd you dd it off from and the one you dd it to are sound and (b) there are no transmission failures beyond what rsh (at that time) would retry and mask.
not the same?
http://www.feyrer.de/g4u/
And since we're running OpenBSD on those machines, porting this should be fairly straightforward... although now that I look at it, he adds some patches for sockets... eugh...
How about disk cloning across servers, for on-demand scalability? As a single server reaches some operating limit, like monthly bandwidth quota, disk capacity, CPU load, etc, a watchdog process clones its disks to a fresh new server. The accumulating data partition may be omitted. A final script downs the old server's TCP/IP interface, and ups the new one with the old IP# (/etc/hostname has already been cloned over). It's like forking the whole server. A little more hacking could clone servers to handle load spikes (not just filling total capacity), running simultaneously under DNS load balancing scheme, like simple round-robin host/IP resolution. And cloning across a WAN could offer geographical distribution for disaster preemption. Is this stuff close to being a .deb package yet?
--
make install -not war
I'm sure you were referring to Ghost, which is great stuff, however, I would hardly consider that "Windows" technology, considering that you can clone Linux systems as well.
You also fail to realise this can be done *live*, while the system still runs, where Ghost can not.
Sorry. As an IT guy I routinely peruse people's hard drives looking for interesting material. I use Windows Scripting Host to search everyone's drives for mp3's, wma's, avi's and mpg's.
It isn't your laptop. You have no freedom to do anything with it.
Does anyone know if this, or any other product for that matter, can be used for making images on Itanium machines?
From der Swedish Chef.
Bork, bork, bork.
Assume I was drunk when I posted this.
Why on earth are people always so insistent on doing raw-level dupes of disks?
First of all, it means backing up a 40GB disk with 2GB of data may actually take 40GB of bandwidth.
Second of all, it means the disk geometries have to be compatible.
Then, I have to wonder if there will be any wackiness with things like journals if you're only restoring a data drive and the kernel versions are different...
I have been using ufsdump / ufsrestore on UNIX for ...decades! It works great, and it's trivial to pump over ssh:
# ssh user@machine ufsdump 0f - /dev/rdsk/c0t0d0s0 | (cd /newdisk && ufsrestore rf -)
or
# ufsdump 0f - /dev/rdsk/c0t0d0s0 | ssh user@machine 'cd /newdisk && ufsrestore rf -'
.. it even supports incremental dumps (see: "dump level"), which is the main reason to use it over tar (tar can do incremental with find . -newer X | tar -cf filename -T -, but it won't handle deletes).
So -- WHY are you people so keen on bit-level dumps? Forensics? That doesn't seem to be what the folks above are commenting on.
Is it just that open source UNIX derivatives and clones don't have dump/restore utilities?
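The point about find -newer not handling deletes is easy to show with a small sketch. It compares two {path: mtime} snapshots; this is only an illustration of the problem, not how ufsdump works internally, and `incremental_plan` is a hypothetical name.

```python
def incremental_plan(
    previous: dict[str, float], current: dict[str, float]
) -> tuple[list[str], list[str]]:
    """Compare two {path: mtime} snapshots.

    Returns (to_copy, to_delete). A plain "newer than X" scan can only ever
    produce to_copy; deletions show up only if the previous listing was kept.
    """
    to_copy = sorted(
        path for path, mtime in current.items()
        if path not in previous or mtime > previous[path]
    )
    to_delete = sorted(path for path in previous if path not in current)
    return to_copy, to_delete
```

With a previous snapshot of a, b, c and a current one where b was modified, c was removed and d was added, the plan copies b and d and flags c for deletion. A restore built only from "newer" files would bring c back from the full backup.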
Do daemons dream of electric sleep()?
UUOC
Image backups certainly have their place for people who can understand their limitations. However, a good, automatic, versioning file backup is almost certainly a higher priority for most computer users. And under some circumstances, they might also want to go with RAID for home computers.
This tool should be better in the case where you are more interested in backups. RAID1 also ensures data integrity when doing reads.
The facility today supports symmetric cryptography, based on a shared secret. The secret is established out-of-band of the network mirror facility today. User identification, authentication and session encryption are all based on leveraging the pre-established shared secret.
----------- Confucius say: "The shared secret is no longer a secret."
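For what it's worth, authenticating with a pre-shared secret does not have to mean sending the secret over the wire. Below is a minimal challenge-response sketch. The article does not name the primitives the facility actually uses, so HMAC-SHA256 and all three function names are assumptions, and session encryption is not shown.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> bytes:
    """Server side: a fresh random nonce for each connection attempt."""
    return secrets.token_bytes(16)


def respond(shared_secret: bytes, challenge: bytes) -> bytes:
    """Client side: prove knowledge of the secret without transmitting it."""
    return hmac.new(shared_secret, challenge, hashlib.sha256).digest()


def verify(shared_secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Server side: recompute the MAC and compare in constant time."""
    expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

Of course, if the laptop is stolen, whatever secret is stored on it goes with it, which is probably the real worry behind the joke.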
He who knows best knows how little he knows. - Thomas Jefferson
... facts are facts. ;)
FreeBSD:
FreeBSD, Stealth-Growth Open Source Project (Jun 2004)
"FreeBSD has dramatically increased its market penetration over the last year."
Nearly 2.5 Million Active Sites running FreeBSD (Jun 2004)
"[FreeBSD] has secured a strong foothold with the hosting community and continues to grow, gaining over a million hostnames and half a million active sites since July 2003."
What's New in the FreeBSD Network Stack (Sep 2004)
"FreeBSD can now route 1Mpps on a 2.8GHz Xeon whilst Linux can't do much more than 100kpps."
NetBSD:
NetBSD, for When Portability and Stability Matter (Oct 2004)
NetBSD sets Internet2 Land Speed World Record (May 2004)
NetBSD again sets Internet2 Land Speed World Record (Sep 2004)
OpenBSD:
OpenBSD Widens Its Scope (Nov 2004)
Review: OpenBSD 3.6 shows steady improvement (Nov 2004)
OpenSSH (OpenBSD subproject) has become a de facto Internet standard.
*BSD in general:
..and last but not least, we have the cutest mascot as well - undisputedly. ;)
Deep study: The world's safest computing environment (Nov 2004)
"The world's safest and most secure 24/7 online computing environment - operating system plus applications - is proving to be the Open Source platform of BSD (Berkeley Software Distribution) and the Mac OS X based on Darwin."
BSD Success Stories (O'Reilly, 2004) (pdf) ~ from Onlamp BSD DevCenter
"The BSDs - FreeBSD, OpenBSD, NetBSD, Darwin, and others - have earned a reputation for stability, security, performance, and ease of administration."
--
Being able to read *other people's* source code is a nice thing, not a 'fundamental freedom'.
rsync is not scalable to large numbers of files. We set up a backuppc machine a while ago, tried to rsync the entire backup set over to another machine... It was a miserable failure. Even if we didn't check for hardlinks (which we have to; backuppc uses tons of hardlinks), the rsync process completely saturated a gig of RAM before it even started syncing.
Now, rsync would have been fine if we'd unmounted the filesystem and done it on the raw partition. But there's a couple of problems with that:
It's not live. Not a big deal for us, since it's a backup machine to begin with, but still...
rsync doesn't do that. A couple of people have submitted patches to allow a flag for rsync to copy block devices as if they were files. They were tiny patches, but they were rejected out of a fear of users doing stupid things with them. I guess the usual Rsync Way is to duplicate the filesystem, so that devices are copied with mknod, not dd.
Don't thank God, thank a doctor!
rsync doesn't scale to huge numbers of files. It also doesn't work so well when all of those are changing at once. Finally, the protocol and algorithms may work for imaging an entire disk as if it was a file, but the program doesn't -- it can ONLY copy device nodes as device nodes, and will NEVER read a block device as a normal file. There have been patches to fix this, which have been rejected.
We use a scheme which actually seems better for systems which are always on: DRBD for Linux. Basically, every block written to a device on the master is automagically duplicated to all the slaves. If the master goes down, you promote one of the slaves to master, mount the partition, and start services. If you have the heartbeat package, this can be done automatically, complete with an IP takeover.
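As a toy model of that write-duplication and promotion flow (nothing like the real DRBD kernel module or its wire protocol; the class names are made up for illustration):

```python
class BlockDevice:
    def __init__(self, name: str):
        self.name = name
        self.blocks: dict[int, bytes] = {}


class ReplicatedVolume:
    """Toy model: every write to the primary is copied to all secondaries."""

    def __init__(self, primary: BlockDevice, secondaries: list[BlockDevice]):
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, block_no: int, data: bytes) -> None:
        # Synchronous replication: the write returns only after every copy is updated.
        for device in [self.primary, *self.secondaries]:
            device.blocks[block_no] = data

    def promote(self, name: str) -> None:
        """Failover: drop the failed primary and make the named secondary primary."""
        new_primary = next(d for d in self.secondaries if d.name == name)
        self.secondaries.remove(new_primary)
        self.primary = new_primary
```

Because the secondary already holds every block that was acknowledged, promotion is just a role change; there is no restore step.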
We aren't using it for high availability, actually. We just use it to duplicate a BackupPC partition out to someone's house, over openvpn. It's much nicer than rsync -- rsync was filling up a couple of gigs of RAM before it sent a single file, and in every instance, it was still eating up more swap when we killed it out of frustration.
The high availability design does help, though. If the entire office gets nuked, we can physically carry the backup box in, turn it on, make it master, and use BackupPC's native restore feature. Sometime soon we're going to make our PHB cream his jeans by demonstrating a full, bare-metal restore.
Don't thank God, thank a doctor!
dd if=/dev/rdsk/rwd0 | gzip | ssh user@remotehost '/usr/local/bin/gunzip - | dd of=/dev/rdsk/wd1'
BSD is a lot older than 10 years. It's probably 20+ years old.
And what about Ghost for You? It does net backup with only one 1.44" disk.
http://www.michel.eti.br
It makes it a lot easier to find a file, 'cause it exists in the same location, uncompressed.
The huge advantage though, is that rsync only transfers those files that have changed. Which means that backups are very quick.
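A minimal sketch of why that is quick: by default rsync skips any file whose size and modification time already match on the destination (its "quick check"). The code below imitates just that decision for a local directory tree. It is not rsync's implementation, it ignores deletions and the delta-transfer algorithm, and the function names are hypothetical.

```python
import os
import shutil


def needs_transfer(src: str, dst: str) -> bool:
    """Quick check: transfer if dst is missing or differs in size or mtime."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def sync_tree(src_root: str, dst_root: str) -> list[str]:
    """Copy only files that fail the quick check; return their relative paths."""
    copied = []
    for dirpath, _dirs, files in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        os.makedirs(os.path.normpath(os.path.join(dst_root, rel_dir)), exist_ok=True)
        for name in files:
            src = os.path.join(dirpath, name)
            dst = os.path.normpath(os.path.join(dst_root, rel_dir, name))
            if needs_transfer(src, dst):
                shutil.copy2(src, dst)  # copy2 preserves mtime, so the next run skips it
                copied.append(os.path.normpath(os.path.join(rel_dir, name)))
    return sorted(copied)
```

A second run over an unchanged tree only stats files and copies nothing, which is where the speed comes from.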
I also mount Samba shares on the backup server, and do rsync backups of "My Documents" folders for the Windows boxes. Works great there too!
Even better, the My Documents folders are available as (read only) Samba shares on the backup box, and the users can find their own files in the backups.
I have been doing this for years, and it works great!
What would be more perfect is simply being a competent admin in the first place, and not letting your users have permissions to fuck everything up. Never mind that this is for NetBSD, which doesn't have a whole lot of viruses, nor a defrag program.
In fact, I can't think of any way that could possibly be worse than what you are doing now. Running a RAID1 over a network block device is horribly inefficient, and slow as all hell. This just backs things up when you want to, not all the time constantly with every trivial change like a network mirror does.
Yeah, I am really inclined to trust software written by someone who's afraid to use his real name, and uses a pseudonym based on his jealousy of another free software developer who actually does use his real name. Does NetBSD really allow anonymous developers like this?
given your ignorance of the topic, i question if you even have a job. employers have the responsibility of knowing what their employees are doing. they are liable by law for the conduct of their employees. if some dumbass is dl'ing shit they shouldn't, somebody needs to find out, that is part of my job.
Actually, the history section of GNU/Linux Application Programming says the first version of BSD came out in 1976. That makes 20+ quite accurate (or 30- even more so).
Actually there is seed entropy stored on the disk. Check out the man page for random. It's used to seed the random number generator at boot time, as the usual system chaos generators are just getting going.
Presumably this could cause a vulnerability around boot time. Say the machine establishes a VPN at boot, and the backup of the seed had just been intercepted before the boot; you might be able to tap the VPN.
haha you wasted a mod point - and now - waste another one.