Complete Filesystem Checkpointing?

Posted by Cliff on Sunday February 10, 2002 @08:00AM from the filesystems-with-a-tau-axis dept.

polymath69 asks: "Living on the edge of Debian unstable means that updates sometimes break stuff, occasionally to an extent that is difficult to recover from. This got me thinking about treating the entire set of mounted filesystems as a transactional database. Mark state, try something which might be dangerous, test, and approve (commit) or panic (rollback). Obviously some filesystem support would be required, but with ext3 and reiserfs available, maybe the potential is already there. And such a system would need lots of disk space, but these days that's a demand easily granted. There's lots out there on process-level checkpointing, and even some stuff about system-level checkpointing, but all I've found on that was in the context of saving and restoring processes for a system freeze and restore. But I couldn't find anything on Google or SourceForge about doing this sort of temporary branching in the filesystem. Is this idea feasible? Is anyone working on it?"
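The mark / try / commit-or-rollback cycle described in the question can be sketched, very roughly, for a single directory tree. This is only an illustration under loud assumptions: the helper names `ckpt_mark`, `ckpt_commit` and `ckpt_rollback` are hypothetical, the checkpoint is a full copy rather than anything filesystem-level, and the directory argument must be given without a trailing slash. A real implementation would need copy-on-write support in the filesystem itself.

```shell
#!/bin/sh
# Hypothetical helpers (not from any existing tool): checkpoint one
# directory tree by keeping a full copy next to it as "<dir>.ckpt".

ckpt_mark() {      # mark state: save the current contents of directory $1
    rm -rf "$1.ckpt" && cp -Rp "$1" "$1.ckpt"
}

ckpt_commit() {    # approve: keep the changes, discard the saved state
    rm -rf "$1.ckpt"
}

ckpt_rollback() {  # panic: throw away the changes, restore the saved state
    [ -d "$1.ckpt" ] || { echo "no checkpoint for $1" >&2; return 1; }
    rm -rf "$1" && mv "$1.ckpt" "$1"
}
```

Usage would be along the lines of `ckpt_mark /some/tree`, then the risky operation, then either `ckpt_commit /some/tree` or `ckpt_rollback /some/tree`. The full copy is what makes this impractical for whole systems, which is why the replies below point at snapshot-capable filesystems.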

Chuq is working on it. by omega9 · 2002-02-10 08:20 · Score: 4, Informative

Well, looks like this guy Chuq is working on it. He seems to be a kernel hacker who works for VERITAS.

You can also find interesting filesystem info here

There's also work being done on TRAM (Transactional RAM).

--
I'm against picketing, but I don't know how to show it.
1. Re:Chuq is working on it. by Tet · 2002-02-11 01:07 · Score: 3, Informative
  
  Well, looks like this guy Chuq is working on it. He seems to be a kernel hacker who works for VERITAS.
  Of course, Veritas have their own FlashSnap product that does this for VxFS filesystems, and have just released it for Linux. It's a relatively pricey option, but it works well, and if you need this sort of functionality, the price is negligible.
  
  --
  "The invisible and the non-existent look very much alike." -- Delos B. McKown
plan9's n/dump is this by DrSkwid · 2002-02-10 08:41 · Score: 3, Informative

the default increment is daily

one can roll back the filesystem on a PER PROCESS basis with the yesterday command.

In this way you can narrow down what's broken by, for instance, using yesterday's C library, or last week's, or last year's!

Also take a look at Venti

From: Sean Quinlan To: 9fans Mailing list

For those of you interested in the direction we are heading
with respect to plan 9's file system, you might want to
check out our paper on Venti that will appear in the
USENIX FAST conference.

http://www.cs.bell-labs.com/~seanq/pub.html#venti

Venti is a block level storage server that replaces the optical
juke box for a plan 9 file system. Some of the benefits include:
coalescing of duplicate blocks
compression
no block fragmentation
Also, we have switched from optical to magnetic disks as the storage
technology. I know many of you already use magnetic disks to
"fake" a worm, but for those of us using an optical juke box,
the performance improvement is rather substantial!!

seanq

--
There are places where the networks are not touching, and there are places where they are - Boeing's Lori Gunter
snapshotting.. by Anonymous Coward · 2002-02-10 10:41 · Score: 2, Informative

Snapshotting is what you really want for something like this. NetApp has had this functionality available in their Filer appliances for a number of years - you can cd into a 'magic' .snapshot directory where hourly, daily, weekly, and monthly snapshots are kept.

FreeBSD 5.0-CURRENT includes preliminary snapshot support for ffs.

The Linux options aren't quite as good. The most promising new filesystem that could provide this functionality is tux2, where data is structured in a way that would make implementing this functionality fairly easy. There was a post explaining how it would work in the mail archives, but they seem to have disappeared.

There is a commercial option: MVD Snap. Their fileserver is Linux based, and the code for their snapfs filesystem was once available during beta testing.
1. Re:snapshotting.. by Guy Harris · 2002-02-10 20:37 · Score: 3, Informative
  
  NetApp has had this functionality available in their Filer appliances for a number of years - you can cd into a 'magic' .snapshot directory where hourly, daily, weekly, and monthly snapshots are kept.
  
  In fact, we've had that since we first shipped our machines. There's a paper on our Web site that discusses how this works, File System Design for an NFS File Server Appliance.
  However, although snapshot directories let you dredge up copies of files from snapshots in case you (or a program) screw up and trash them, that's not a convenient way to roll back the state of the entire file system.
  We did implement that later (atop the same mechanism); see SnapMirror and SnapRestore: Advances in Snapshot Technology - SnapRestore(TM)(R)(LSMFT) is the "roll back an entire file system to a snapshot" feature. (At times, all this SnapStuff makes me want to SnapTheNeckOfMarketing, but so it goes....) That paper doesn't discuss technical details to the extent that the other paper does, but it should be possible from the earlier paper to figure out at least some of how you'd do it.
Re:VMWare by dubl-u · 2002-02-10 17:52 · Score: 3, Informative

I love, love, love VMWare for this. It's ctrl-Z for sysadmins.

Lately I wanted to experiment with the various kernel-level security packages like LOMAC, LIDS, and SELinux. It was great to be able to build a default linux install on a virtual disk and then copy it three or four times to install the weird security stuff.

It's even better for non-Unix OSes. A friend wanted help installing his Java web app on NT. I built a variety of virtual machines for testing, all using the VMWare "Undoable disk" choice. So when some weird registry key got screwed up by an Oracle installer, I just picked "Undo" and tried again!

If you have to use a crappy OS or packages that are inclined to break things and put crap everywhere, VMWare is a delight!

(Yep, I'm just a happy customer.)
AFS by William Aoki · 2002-02-10 19:37 · Score: 4, Informative

AFS will do something like that, although not to the extent that I hear NetApp Filers can. Off the top of my head, there are two ways to do this with AFS. Both these methods require superuser access to your AFS cell, unless backups or replication releases are being done automatically.

(CodaFS should be able to do this too. I haven't played with CodaFS enough to know if it offers any other way to accomplish checkpointing.)

Method 1: backup volumes

$ cd /afs/mycell/some/path
$ kinit me/admin
Password for me/admin@MYCELL:
$ aklog
$ vos backup some.path.avol
$ kinit me
Password for me@MYCELL:
$ aklog
$ cd avol

do stuff with the filesystem...
Oops! I need files that I modified or deleted!

$ cd ..
$ fs mkm avol.backup some.path.avol.backup
$ cp avol.backup/little-lost-file avol/
$ fs rmm avol.backup

Many sites run 'vos backupsys' (generally before 'vos dump'ing volumes) every night to automatically back up all their volumes, and leave users' backup home volumes mounted under their home volumes, to provide easy access to yesterday's files without an administrator's help.

Method 2: for replicated volumes

$ cd /afs/.mycell/some/volume

do stuff - uh-oh, I need a file back that I changed!

$ cp /afs/mycell/some/volume/my/file my/file

ok, finished with the changes. Commit them!

$ kinit me/admin
Password for me/admin@MYCELL:
$ aklog
$ vos release some.volume
Released volume some.volume successfully
$ kinit me
Password for me@MYCELL:
$ aklog

Volume (for volume, read filesystem) backups work by saving the state of a volume at the time the backup command was issued. When changes are made to the volume, the original state is copied to the backup volume. The backup volume only takes as much space as the changes made since the last backup. Replication works by making read-only copies of a volume in one or more locations, as specified by 'vos addsite' commands. The copies are only updated when changes are 'released' from the read-write copy to the read-only copies. By convention, cell root volumes are mounted read-only on /afs/cellname and read-write on /afs/.cellname.
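The copy-on-write behaviour of a backup volume can be illustrated with a small sketch. This is not how AFS is implemented (AFS works below the file level, inside the fileserver); it is a file-granularity model using plain directories, and the function name `cow_write` and its arguments are hypothetical. It ignores files created or deleted after the backup point.

```shell
#!/bin/sh
# Hypothetical model of a backup volume: the first time a file changes
# after the backup point, its original content is saved to the backup
# tree. Unchanged files take no extra space there.
#
# usage: cow_write VOLUME_DIR BACKUP_DIR RELATIVE_FILE NEW_CONTENT
cow_write() {
    vol=$1
    bak=$2
    file=$3
    # Preserve the original only on the first change since the backup.
    if [ -e "$vol/$file" ] && [ ! -e "$bak/$file" ]; then
        mkdir -p "$(dirname "$bak/$file")" &&
        cp -p "$vol/$file" "$bak/$file" || return 1
    fi
    printf '%s\n' "$4" > "$vol/$file"
}
```

After two successive writes to the same file, the backup tree still holds the content from the backup point, not the intermediate version, and files that were never written to do not appear in it at all.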

I think that newer versions of Solaris will do checkpointing on UFS. I haven't adminned Solaris since 2.3 (the slooow SS20 with 2.8 under my bed doesn't count until I play with it some more), so I'm not familiar with the details.