Slashdot Mirror


Ext3cow Versioning File System Released For 2.6

Zachary Peterson writes "Ext3cow, an open-source versioning file system based on ext3, has been released for the 2.6 Linux kernel. Ext3cow allows users to view their file system as it appeared at any point in time through a natural, time-shifting interface. This can be very useful for revision control, intrusion detection, preventing data loss, and meeting the requirements of data retention legislation. See the link for kernel patches and details."

6 of 241 comments

  1. Re:So which is it? by Bob54321 · · Score: 4, Informative

    From the example screenshot it appears it is a file system. You take a snapshot of your system at some point in time and it stores this data even when files change. Of course, with any file system it is important to have functionality that allows you to view the files as well...

    --
    :(){ :|:& };:
  2. The C in CVS. by SharpFang · · Score: 4, Informative

    Concurrent...

    Sure you can "go back in time", but two users working on the same file at the same time would be a pain. Networking would require additional layers, even if only plain Samba/NFS. Plus you'd need a bunch of userspace utilities as a UI to access it easily.

    It's not bad as a backend for such a system, just like MySQL is good as a backend for a website, but by itself it's pretty much worthless.

    --
    45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
  3. Re:Overhead? by JoeD · · Score: 3, Informative

    Check the "Publications" link. The first one is an article in "ACM Transactions on Storage".

    It's a bit dry, but there is an explanation of how it stores the versions, plus some performance benchmarks.

  4. some background by pikine · · Score: 4, Informative

    I'm answering the questions people have posted so far, all together.

    Is it a file system or a file manager?

    It is a file system. You access an old snapshot by appending '@timestamp' to your file name. You have to instruct ext3cow to take a snapshot before you can retrieve old copies; otherwise it simply behaves like ext3. It appears that a snapshot is always performed on a directory and applies to all inodes (files and subdirectories) under it.
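
    To make the naming scheme concrete, here is a minimal sketch of building such a name. It assumes the timestamp is whole seconds since the Unix epoch, which the comment above does not state; `snapshot_path` and the mount point are hypothetical names used only for illustration.

```python
import time


def snapshot_path(path, when):
    """Return the name that addresses `path` as of time `when`.

    Assumption: the '@timestamp' suffix is whole seconds since the Unix epoch.
    """
    return f"{path}@{int(when)}"


# Hypothetical usage: name a file as it looked one hour ago.
one_hour_ago = time.time() - 3600
print(snapshot_path("/mnt/ext3cow/notes.txt", one_hour_ago))
```

    The resulting string would then be passed to ordinary tools (cat, cp, ls) like any other file name, provided a snapshot covering that time exists.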

    My complaint is its use of '@' to access snapshots. Why not use '?' and make it look like a URL query? Better yet, use a special prefix '.snapshot/' like NetApp file servers.

    Does it store many copies of each file? or only the differences between the old and the new version?

    How far off is it to use these filesystems as a revision control system replacement?

    ext3cow takes its name from "copy on write," and it does this on the block level. When you modify a file, it appears to the file system that you're modifying a block of e.g. 4096 bytes. COW preserves the old block while constructing a new file using the blocks you modified plus the blocks you didn't modify.
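
    The idea can be sketched with a toy in-memory model. This is not how ext3cow is implemented on disk; `CowFile` and its methods are hypothetical names, and Python object references stand in for block pointers.

```python
BLOCK_SIZE = 4096  # the example block size mentioned above


class CowFile:
    """Toy in-memory file whose snapshots share unmodified blocks."""

    def __init__(self, data=b""):
        # Split the data into fixed-size blocks.
        self.blocks = [data[i:i + BLOCK_SIZE]
                       for i in range(0, len(data), BLOCK_SIZE)]
        self.snapshots = {}  # snapshot name -> list of block references

    def snapshot(self, name):
        # Copy only the list of references; block contents stay shared.
        self.snapshots[name] = list(self.blocks)

    def write_block(self, index, payload):
        # Point the live file at a new block. Blocks referenced by a
        # snapshot are never mutated, so old versions stay intact.
        self.blocks[index] = payload

    def read(self, name=None):
        blocks = self.blocks if name is None else self.snapshots[name]
        return b"".join(blocks)


f = CowFile(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE)
f.snapshot("before")
f.write_block(1, b"X" * BLOCK_SIZE)

# Count blocks that the snapshot and the live file still have in common.
shared = sum(old is new for old, new in zip(f.snapshots["before"], f.blocks))
print(shared)
```

    Only the modified block costs extra space here; the other two are referenced by both the snapshot and the live file.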

    You can think about it as block-level version control. However, when you save a file, most programs simply write a whole new file (I'm only aware of mailbox programs that try to append or modify in-place). Block-level copy on write is unlikely to buy you anything in practical use.

    Does it provide undelete?

    Only when you remember to make a snapshot of your whole directory. An hourly cron-job would do, maybe. There is always the possibility you delete a file before a snapshot is made.

    --
    I once had a signature.
  5. Re:Overhead? by anilg · · Score: 3, Informative

    COW has been present in ZFS for a long time, since Solaris 10. The overhead there is negligible and it's quite stable. Last I heard, it was being ported to FUSE on Linux, and it is upcoming in the next releases of FreeBSD and OS X. Wiki has a pretty good article.

    --
    http://dilemma.gulecha.org - My philosophical short film.
  6. Re:True undelete by xenocide2 · · Score: 3, Informative

    There are a couple of reasons for it not being in the kernel. First, it misleads users who expect some degree of data security. The good news is that sort of person likely follows kernel patches to the FS and would likely be aware of the problem, possibly even writing a script that replaces rm with a real-rm.

    The second argument is that it's better handled in user space, so the OS doesn't have to make that sort of policy. There's no reason you can't just alias rm to some .Trash, or configure your Desktop Environment to do so (GNOME does, for example). There's all sorts of things you have to decide that might not suit everyone. For example, if I delete a file on a USB drive, does it go in a .Trash storage in the USB drive, or do we copy it over to a main .Trash folder? Many people don't realize they have to empty the trash to reclaim space on their thumbdrive in GNOME.
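
    As a rough illustration of the userspace approach, here is a minimal sketch of a "move to trash" helper. The `trash` function, its `trash_dir` parameter, and the timestamp suffix are assumptions made for this example; it picks one policy (a single per-user directory) and does not follow any desktop environment's trash layout.

```python
import os
import shutil
import time


def trash(path, trash_dir=None):
    """Move `path` into a per-user trash directory instead of deleting it."""
    if trash_dir is None:
        trash_dir = os.path.expanduser("~/.Trash")
    os.makedirs(trash_dir, exist_ok=True)
    # Suffix the name with a timestamp so that deleting two files with the
    # same name does not overwrite the earlier one.
    name = f"{os.path.basename(os.path.abspath(path))}.{time.time_ns()}"
    dest = os.path.join(trash_dir, name)
    # shutil.move falls back to copy-then-delete across file systems, which
    # is exactly the USB-drive policy question raised above.
    shutil.move(path, dest)
    return dest
```

    Choosing a default `trash_dir` on the home file system means a file removed from a thumb drive gets copied off it, which frees space there but may be slow; keeping a trash directory per device is the opposite trade-off.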

    The final argument I can come up with is security problems. We can't have one global .Trash bin in a multiuser system. And quotas. And permissions.

    Reading historic archives of the LKML suggests it has come up at least once. I guess Torvalds' opinion is that anything that CAN go in userspace SHOULD. Can't explain the webserver in the kernel though. Perhaps that opinion has changed some time in the last 10 years?

    --
    I Browse at +4 Flamebait

    Open Source Sysadmin