Filesystems with Transactions?

Posted by Cliff on Sunday October 14, 2001 @07:20AM from the cp-rm-mv-ln-rb-and-cm? dept.

Bryan Andersen asks: "I'm looking for a filesystem that lets me roll back all the changes made by a user to a given date/time. Are there any for Linux or *BSD, or is my only option to go to one of the NAS vendors? I want this so I can more easily clean up after users trash all the files they can access. Yes, I know this would mean I'd have to have much larger partition sizes, but I feel with disk prices the way they are I can't go wrong doing this." I'm not aware of any filesystems that can specifically do this, and I'm not up enough on my JFS knowledge to know whether any of those can be adapted to this task without code changes. It would seem that the easiest way to do this would be to mirror the drives at set times (your "commit"); a "rollback" would then be a simple matter of restoring from those images. Of course, there may be just such a filesystem in the works that I simply haven't heard about yet. Have you?

How about using Samba VFS? by mwaddell · 2001-10-14 20:44 · Score: 3, Interesting

I was thinking about doing something very similar to this. I haven't gotten around to implementing it yet, but what I want to do is use the VFS feature of Samba to add CVS-like version control (possibly by interfacing with a local CVS server?) to certain directories of files.

--
"It is only with the heart that one can see rightly; what is essential is invisible to the eye." -Saint-Exupery

Some thoughts... by coyote-san · 2001-10-15 03:14 · Score: 5, Interesting

I looked into what would be required to implement this a while back. It's actually pretty straightforward, although the naive implementation will tend to grind the disk. (Using two spindles is a *very* good idea!)

To implement it, you need to create three subpartitions. The naive implementation keeps them as three distinct areas; better implementations would interleave them somehow.

The first subpartition contains the live filesystem, and it could be *any* filesystem. It really doesn't matter - like the loopback FS, this approach creates a new virtual device that only cares about individual blocks.

The second subpartition contains a circular buffer with the *previous* contents of each block as it is written.

The third subpartition contains an index, with one entry for each block in the second subpartition. Again, it would be a circular buffer on the disk. (Indeed, for performance it should be interleaved with the cache of old blocks, e.g., one index block followed by the 256 cache blocks it represents, repeating.) Each index entry contains the block number and the time it was updated. Alternatively, you could store just the last block number and maintain a separate list containing timestamps and "last index written."
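As a rough illustration of the interleaved layout, here is a minimal Python sketch of the arithmetic involved. The 4096-byte block size and the 16-byte entry (an 8-byte block number plus an 8-byte timestamp) are assumptions, chosen so that one index block describes exactly 256 cache blocks; `locate` and `pack_entry` are hypothetical names, not part of any real driver.

```python
import struct

BLOCK_SIZE = 4096                 # assumed block size in bytes
ENTRY = struct.Struct("<QQ")      # assumed entry: live block number, update time
ENTRIES_PER_INDEX_BLOCK = BLOCK_SIZE // ENTRY.size   # 4096 // 16 = 256
GROUP_BLOCKS = 1 + ENTRIES_PER_INDEX_BLOCK           # 1 index block + 256 cache blocks


def locate(slot):
    """Map a circular-buffer slot to (index_block, entry_offset, cache_block).

    Block numbers are relative to the start of the combined index/cache area;
    entry_offset is a byte offset inside the index block.
    """
    group, within = divmod(slot, ENTRIES_PER_INDEX_BLOCK)
    index_block = group * GROUP_BLOCKS        # each group starts with its index block
    entry_offset = within * ENTRY.size
    cache_block = index_block + 1 + within    # cache blocks follow their index block
    return index_block, entry_offset, cache_block


def pack_entry(block_no, updated_at):
    """Serialize one index entry: which live block was saved, and when."""
    return ENTRY.pack(block_no, updated_at)
```

With this layout, the index entry for a saved block sits at most 256 blocks away from the saved data, which is the point of interleaving: fewer long seeks between the index and the cache.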

Write access is straightforward: immediately before you write any block, you copy the existing block into the circular buffer, update the index, then write the new block. This is not much different from regular journaling systems.
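The write path can be sketched in a few lines of Python. This is only an in-memory model under stated assumptions: the three Python lists stand in for the three subpartitions, and the class and attribute names (`UndoLog`, `next_slot`) are hypothetical.

```python
class UndoLog:
    """In-memory model of the copy-before-write scheme (hypothetical names)."""

    def __init__(self, live_blocks, log_slots):
        self.live = list(live_blocks)     # first subpartition: the live filesystem
        self.cache = [None] * log_slots   # second: previous contents of written blocks
        self.index = [None] * log_slots   # third: (block number, update time) per slot
        self.next_slot = 0                # number of writes logged so far

    def write(self, block_no, data, now):
        # Circular buffer: once full, the oldest saved block is overwritten.
        slot = self.next_slot % len(self.cache)
        self.cache[slot] = self.live[block_no]   # 1. copy the existing block
        self.index[slot] = (block_no, now)       # 2. update the index
        self.live[block_no] = data               # 3. only then write the new block
        self.next_slot += 1
```

The ordering matters: if the new data were written first, a crash between the steps would lose the old contents that a later rollback depends on.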

Read access is a bit more complex. If you are "live," you always read the live FS. If you have rolled back the FS, you check the index for the first update to that block after the time in question. If one exists, you return the cached block. Otherwise you return the block from the live FS. But in practice you will undoubtedly mount each rolled-back version of the FS explicitly. With a fixed time, you can create a bitmap of changed blocks and quickly load the appropriate block. The driver would have to update this bitmap if the 'live' FS is also mounted. With a "delayed realtime" mount (e.g., showing changes as they occurred 12 hours ago) you would update the bitmap from the index prior to each read.
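A minimal Python sketch of the rolled-back read path, again as an in-memory model. The names (`RollbackDevice`, `changed_bitmap`, `horizon`) are hypothetical, a `deque` stands in for the on-disk circular buffer and index, and the `horizon` check is an added assumption: once the buffer has wrapped, times older than the last discarded entry can no longer be reconstructed, so the sketch refuses them.

```python
from collections import deque


class RollbackDevice:
    """In-memory model: live blocks plus a bounded log of overwritten blocks."""

    def __init__(self, live_blocks, log_slots):
        self.live = list(live_blocks)
        self.log = deque()          # oldest first: (block_no, update_time, old_data)
        self.log_slots = log_slots
        self.horizon = None         # update time of the most recently discarded entry

    def write(self, block_no, data, now):
        if len(self.log) == self.log_slots:      # buffer full: drop the oldest entry
            _, self.horizon, _ = self.log.popleft()
        self.log.append((block_no, now, self.live[block_no]))
        self.live[block_no] = data

    def _check(self, as_of):
        if self.horizon is not None and as_of < self.horizon:
            raise ValueError("history before that time has been overwritten")

    def read(self, block_no, as_of=None):
        if as_of is None:                        # "live" mount
            return self.live[block_no]
        self._check(as_of)
        # The first update to this block after as_of saved what the block
        # held at as_of; scanning oldest first finds that entry.
        for logged_block, when, old_data in self.log:
            if logged_block == block_no and when > as_of:
                return old_data
        return self.live[block_no]               # unchanged since as_of

    def changed_bitmap(self, as_of):
        """One flag per live block: True if it was written after as_of."""
        self._check(as_of)
        bitmap = [False] * len(self.live)
        for logged_block, when, _ in self.log:
            if when > as_of:
                bitmap[logged_block] = True
        return bitmap
```

For a fixed-time mount, `changed_bitmap` would be computed once and consulted before each read, so blocks that never changed skip the index scan entirely; this sketch rebuilds it on demand, which is closer to the "delayed realtime" case.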

--
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken