Filesystems with Transactions?
Bryan Andersen asks: "I'm looking for a filesystem that lets me roll back all the changes made by a user to a given date/time. Are there any for Linux or *BSD, or is my only option to go to one of the NAS vendors? I want this so I can more easily clean up after users trash all the files they can access. Yes I know this would mean I'd have to have much larger partition sizes, but I feel with disk prices the way they are I can't go wrong doing this." I'm not aware of any filesystems that can specifically do this, and I'm not up enough on my JFS knowledge to know whether any of those can be adapted to this task without code changes. It would seem like the easiest way to do this would be to mirror the drives at set times (your "commit"), and then a "rollback" would be a simple matter of restoring from those images. Of course, there may be just such a file system in the works that I simply haven't heard about yet. Have you?
Unless you make a specific system backup point, where the system is in a completely safe state you know you can return to, wouldn't you still have to be concerned with cascading rollbacks?
Hmmm, as soon as you talked about rolling back "trashed" files I immediately began thinking about some sort of optimistic validation protocol, where transactions attempt to write to the same file and one is rolled back (say, based on timestamp). Then I caught that you just want a restore point for users, and I'm wondering: why the overhead? Why not just back up to external tape or, as you suggest, to additional internal hard disks?
Ok, so maybe having a file system handle the restoration rather than you might seem easy, but how hard is backup software?
Wheeeee
What I did is have my computer back up every text file in my home directory (other than those matching a list of exclude patterns) using cvs, every 3 hours. This did not take up much space, because cvs only stores the changes to a file.
Every time I did a full backup, I backed up everything including the CVS directory, and then emptied that directory.
It was really easy to set up, and I can dig up the script if anyone is interested.
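For anyone who wants to try this before the original script turns up, here is a rough sketch of the file-selection half in Python. It is not the poster's script. The function names, the exclude patterns, and the "no NUL byte in the first 1024 bytes means text" heuristic are all assumptions made for illustration.

```python
import fnmatch
import os


def is_text_file(path, probe_size=1024):
    """Heuristic (an assumption): a file is text if its first bytes hold no NUL."""
    with open(path, "rb") as f:
        return b"\0" not in f.read(probe_size)


def files_to_version(root, exclude_patterns):
    """Return sorted paths (relative to root) of text files not matching any exclude pattern."""
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into CVS's own bookkeeping directories.
        dirnames[:] = [d for d in dirnames if d != "CVS"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if any(fnmatch.fnmatch(rel, pat) for pat in exclude_patterns):
                continue
            if is_text_file(path):
                selected.append(rel)
    return sorted(selected)
```

A cron job running every three hours could pass new names from this list to `cvs add` and then run `cvs commit -m` with a timestamp message in the working directory.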
I was thinking about doing something very similar to this. I haven't gotten around to implementing it yet, but what I want to do is to use the VFS feature of Samba to add CVS-like (possibly by interfacing with a local CVS server?) versioning control to certain directories of files.
"It is only with the heart that one can see rightly; what is essential is invisible to the eye." -Saint-Exupery
OpenVMS file versioning.
:-)
:-)
'nuff said.
And hey, you can make a KICKASS cluster of these! Forget that beowulf stuff... why not have a cluster that actually _does_ something?
--nbvb
It's called 'system restore' and it's a feature of Windows ME and possibly XP. It's a wonderful wonderful thing, and has saved me from quite a few driver conflicts and "supported" (note the quotes) hardware installs.
Vintage computer games and RPG books available. Email me if you're interested.
Network Appliance, vendor of Network Attached Storage, has (or at least had when I looked) a feature close to this. They use a proprietary filesystem on their NAS boxes, which allows them to do unusual things.
One place where this is quite clever is in providing stable snapshot views of the filesystem for the backup software to look at while applications continue to use it.
Isn't ClearCase based on the versioned file systems in VMS? It's based on something like that.
We have a transactional system built on top of ClearCase where I work. It's OK but I'm sure there are neater solutions to the problem.
I looked into what would be required to implement this a while back. It's actually pretty straightforward, although the naive implementation will tend to grind the disk. (Using two spindles is a *very* good idea!)
To implement it, you need to create three subpartitions. The naive implementation has three distinct areas; better implementations would interleave them somehow.
The first subpartition contains the live filesystem, and it could be *any* filesystem. It really doesn't matter - like the loopback FS, this approach creates a new virtual device that only cares about individual blocks.
The second subpartition contains a circular buffer with the *previous* contents of each block as it is written.
The third subpartition contains an index, one entry for each block in the second subpartition. Again, it would be a circular buffer on the disk. (Indeed, for performance it should be interleaved with the cache, e.g., one index block followed by the 256 cache blocks it represents, repeating.) The index contains the block number and the time it was updated. Alternatively, you could store just the last block number and maintain a separate list containing time stamps and "last index written."
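The interleaved layout is easy to get off by one, so here is a small Python sketch of the arithmetic. It assumes slots in the circular buffer are numbered from 0 and that each group is exactly one index block followed by its 256 cache blocks. The names `slot_location` and `next_slot` are made up for illustration.

```python
BLOCKS_PER_INDEX = 256  # cache blocks described by one index block


def slot_location(slot):
    """Map a circular-buffer slot number to on-disk block offsets.

    Returns (index_block, entry_in_index, cache_block), all counted in
    blocks from the start of the interleaved area.
    """
    group, within = divmod(slot, BLOCKS_PER_INDEX)
    index_block = group * (BLOCKS_PER_INDEX + 1)  # each group is 1 + 256 blocks
    cache_block = index_block + 1 + within
    return index_block, within, cache_block


def next_slot(slot, capacity):
    """Advance the circular buffer, wrapping to slot 0 at the end."""
    return (slot + 1) % capacity
```

Because the index block sits directly in front of the cache blocks it describes, saving an old block and updating its index entry touch neighbouring disk locations, which should cut down on the seeking mentioned above.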
Write access is straightforward: immediately before you write any block, you copy the existing block into the circular buffer, update the index, then write the new block. This is not much different from regular journaling systems.
Read access is a bit more complex. If you are "live," you always read the live FS. If you have rolled back the FS, you check the index for the first update to that block after the time in question. If one exists, you return the cached block. Otherwise you return the block from the live FS. But in practice you will undoubtedly explicitly mount each rolled-back version of the FS. With a fixed time, you can create a bitmap of changed blocks and quickly load the appropriate block. The driver would have to update this bitmap if the 'live' FS is also mounted. With a "delayed realtime" mount (e.g., showing changes as they occurred 12 hours ago) you would update the bitmap from the index prior to each read.
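To make the write and read rules concrete, here is a toy in-memory model in Python. It is a sketch under several assumptions, not a driver: the class and method names are invented, the timestamps are plain numbers passed in by the caller and assumed to increase, and the buffer and index are folded into one bounded queue instead of separate on-disk areas.

```python
from collections import deque


class UndoLoggedDevice:
    """Toy block device that saves old block contents before each write."""

    def __init__(self, num_blocks, log_capacity, block_size=4):
        self.block_size = block_size
        self.live = [bytes(block_size) for _ in range(num_blocks)]
        # Circular buffer plus index in one structure: each entry is
        # (block_no, time_of_overwrite, previous_contents). When full,
        # the oldest entry is dropped, which limits how far back you can go.
        self.log = deque(maxlen=log_capacity)

    def write(self, block_no, data, now):
        if len(data) != self.block_size:
            raise ValueError("data must be exactly one block")
        # Save the existing block first, then overwrite the live copy.
        self.log.append((block_no, now, self.live[block_no]))
        self.live[block_no] = data

    def read(self, block_no, as_of=None):
        if as_of is None:
            return self.live[block_no]  # "live" view
        # Entries are in time order, so the first match is the first
        # update after as_of; its saved contents are the block at as_of.
        for logged_block, written_at, old in self.log:
            if logged_block == block_no and written_at > as_of:
                return old
        return self.live[block_no]  # unchanged since as_of

    def changed_since(self, as_of):
        """Blocks whose rolled-back contents differ in origin from live (the 'bitmap')."""
        return {b for b, written_at, _ in self.log if written_at > as_of}
```

For example, after writing `b"aaaa"` to block 0 at time 10 and `b"bbbb"` at time 20, `read(0, as_of=15)` returns `b"aaaa"` while `read(0)` returns `b"bbbb"`. Note that a real implementation would also have to refuse rollback times older than the oldest surviving log entry, which this sketch does not check.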
For every complex problem there is an answer that is clear, simple, and wrong. -- H L Mencken
Commercially, you have Files-11 ODS-2 for OpenVMS (which is really neat if you've never used it). Netware also has a salvage queue for its Traditional FS as well as for its journaling file system (whose name escapes me at the moment).
You could also look into Oracle Intermedia, which lets you store files of any type in a transaction database.
Otherwise, you could kludge something with CVS/cron/tar. I've seen some interesting things done during the day with software mirrors splitting for backups (OpenVMS volume shadowing or Solaris Disk Suite), but you risk hosing open files at that point.
Best of luck.
"All I ever wanted was to see Larry Wall give Bill Gates a Perl necklace."
http://www.eisenschmidt.org/jweisen
I'm surprised nobody's yet mentioned union mounts, available in at least OpenBSD and FreeBSD.
The classical use for a union filesystem is to make a CD-ROM appear to be read-write. You mount the CD and then mount another partition on top of it with the union option. Any changes are made to the union-mounted partition.
The underlying filesystem doesn't have to be a CD-ROM, of course. Your problem could be quite easily solved with three disk partitions: two large enough to hold everything, and one large enough to hold the changes.
Start by mounting one of the large partitions and then union mounting the smaller one on top of it. If you need to roll back, simply unmount and newfs the union partition. When you want to commit, assume that wd1c and wd2c are your large partitions and wd3c is your small partition and do something like:
As an added bonus, the union-mounted filesystem can be mounted normally later and you only see the modified files.
Of course, if you're working with really large filesystems and time is critical, this is likely to be too slow for you.
b&
All but God can prove this sentence true.