Slashdot Mirror


Best Shrinkable ReiserFS Replacement?

paulkoan writes "I have been using ReiserFS for my file system across a few servers for some time now (follow the link below for details of my experience). I can't foresee the future of ReiserFS, but if I'm going to have to migrate as support diminishes, I'd like to begin that process now. My criteria are: in-kernel support, shrinkable, and has good recovery when the file system is not closed properly. That shrinkable requirement precludes a lot of options. What's a good replacement for ReiserFS?"
I initially chose ReiserFS because I was building a MythTV system and it was the recommended FS across the board, from small to large files. I've had good experiences with ReiserFS and it has had a pummeling. That MythTV box for example has a very volatile environment and loses power on a regular basis. I haven't lost any data through any of these outages.

Compare this to my brief foray into XFS on the same box, where 25% of the filesystem ended up in lost+found with numbers for filenames. When this happened a second time on a different system I decided XFS wasn't for me — and I really don't get the point of a journalled filesystem that will keep data relatively safe, but then remove any means to identify it when things go wrong.

But everyone has good and bad experiences with filesystems, ReiserFS included. XFS has a good rep, my experience aside.

26 of 508 comments

  1. Some general thoughts by TheMidnight · · Score: 4, Informative

    I've heard good things about ZFS from Sun Microsystems, though I don't have much experience with it. Ext3 seems to have decent crash recovery though it requires fscks almost every time. JFS2 from IBM is the most solid filesystem I've ever seen, but I don't know if such a filesystem works with MythTV.

    1. Re:Some general thoughts by Anonymous Coward · · Score: 5, Informative

      JFS2 from IBM is the most solid filesystem I've ever seen, but I don't know if such a filesystem works with MythTV.

      JFS2 works perfectly with MythTV.

      I use JFS exclusively for my MythTV store, because it's the hands-down winner for deletion of large files (something that happens frequently with a MythTV box.)

      Note that JFS doesn't support shrinking, so it's not an option for the submitter.

  2. shrinkable? by gardyloo · · Score: 5, Informative

    My fastest way of checking what operations can be supported on filesystems at the present is by checking what gparted can do. Of the filesystems it works with right now, only four (jfs, reiser4, ufs, xfs) can't be shrunk using gparted.

  3. I can only speak for myself by UnknowingFool · · Score: 4, Informative

    For my MythTV installation, I choose ext3 for the system partitions like / and /usr and xfs for my /video partition. My system partitions are on a RAID 1 while my /video partition is a 1TB RAID 10 LVM. ext3 is more than adequate for my purposes and it does a decent job of recovery. Earlier this year my server started crashing intermittently with no messages in the error logs. I finally traced it to a bad stick of RAM and ext3 recovered in most of the cases. In one case I had to repair mysql databases, but that was the only hiccup.

    --
    Well, there's spam egg sausage and spam, that's not got much spam in it.
  4. Ext3? by Conception · · Score: 4, Informative

    Ext3 with LVM seems to be the popular way to go about this. Unless you really want an esoteric solution, from your requirements I don't see a reason to stray from the norm.

  5. ReiserFS is the data-killer by KiloByte · · Score: 5, Informative

    Ugh, ReiserFS and "good recovery when the file system is not closed properly"? It doesn't even have good recovery after a proper shutdown.

    When other filesystems die, the damage is localised. When Reiser fucks up, all or nearly all of the tree is lost. Usually, you'll lose all files bigger than 4KB, although other damage modes are possible.

    Reiser has a codebase of an insane size. A relatively small piece of code can be mostly bug-free, Reiser is simply too large, complex and ill-tested. I admit, I haven't given it a try recently but you can guess why I hate the very idea of approaching it without a ten-foot pole.

    I've seen XFS screw a number of random files, ext3 mangled only files that were being written to, and my personal favourite is JFS. Even though I use JFS most of the time, the only screwup I witnessed was on a RAID without a write-intent bitmap.

    --
    The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
  6. Re:How about - by AndrewNeo · · Score: 5, Informative

    Despite the parent trying to be funny, NTFS does support shrinking. I've used it to shrink a full disk partition down a bit to install a Linux one on the side.

    (Now cue 'no room left for Windows on the drive' jokes)

  7. Stay Put by m6ack · · Score: 5, Informative

    ReiserFS is still being used and maintained in-kernel. It's Stable, and it just works for you and for hundreds of thousands of others; so, what's the rush?

    I'd wait for the next batch of next gen FS (BTRFS, Tux3) to show their stuff -- and perhaps take a look at getting involved. Daniel Phillips has recently sent out a call for help... Sounds like you have an itch -- go scratch it.

    1. Re:Stay Put by gbjbaanb · · Score: 3, Informative

      link for the lazy, and a description of the FS.

  8. MythTV? by Awptimus Prime · · Score: 3, Informative

    That MythTV box for example has a very volatile environment and loses power on a regular basis. I haven't lost any data through any of these outages.

    Okay, you need to consider a couple of things. First off, this is MythTV. Your concept of "large files" and the normal industry use of "large files" are two entirely different things. I really doubt you are going to exceed any limitations of a modern filesystem with porn, DVDs, and television recordings.

    Second, a power outage isn't going to cost you archived data that is only being read (a DivX file, for example). But no file system that uses system memory for a cache is going to cope well with having the power yanked abruptly while it's writing.

    Third, just use ext3. It's one of the most used, reliable, and proven file systems to date. If it's not enough, you are better off using a UPS and software RAID 5 across an array of a few similar-sized drives, with an ext3 file system.

    Let's please filter further headlines where people are asking about what exotic filesystem they should be trying out for non-raid applications. PLEASE.

    1. Re:MythTV? by GrumpyOldMan · · Score: 4, Informative

      He's concerned about "large files" because ext3 takes eons (10 to 20 seconds) to delete large (8GB/hr) files generated by recording HDTV. This used to be important on MythTV, because deletions were synchronous. So using ext2 in combination with HDTV on MythTV meant a 10 to 20 second "freeze" when manually deleting something, or missing 10-20 seconds of a new recording while an auto-expire deleted an old show.

      In newer versions of MythTV, deletions are done by a separate thread, so there should be no concerns about using ext2/ext3.

  9. ext3 with data journaling by davidwr · · Score: 4, Informative

    Performance may crawl to a standstill, but ext3 with full journaling of data, not just metadata, should make crash recovery nearly bulletproof.

    Another option is to reduce the number of crashes:
    Make sure your software and hardware are stable and use a good, stable battery-backed power supply.

    The latter is good advice for any system.

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
    1. Re:ext3 with data journaling by dermoth666 · · Score: 5, Informative

      The general problem with journaling filesystem recovery is not data going unwritten (although some very specific applications do require that guarantee), since most serious apps like databases just fsync what they need on disk. Problems arise when you have an unprotected write cache.

      This can happen on SCSI/SAS RAID cards when you force the write cache on without a battery, but the most general cause is cheap hardware, especially IDE/SATA disks. For performance reasons they usually have the write cache enabled by default, and on many disks (possibly not many SATA disks, but this was common on IDE) you can't even disable the write cache (hdparm -W0).

      With this kind of configuration, no matter what you do in terms of journaling, you will *always* lose data when power fails during I/O operations.
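
For reference, a minimal sketch of the `hdparm -W0` step mentioned above. It only builds the command lines and runs nothing; the device names are placeholders, and whether a given disk honours the setting is something to check per drive.

```python
from typing import List


def write_cache_off_commands(disks: List[str]) -> List[List[str]]:
    """Return one 'hdparm -W0 <disk>' command per disk (not executed)."""
    return [["hdparm", "-W0", disk] for disk in disks]


if __name__ == "__main__":
    # /dev/sda and /dev/sdb are hypothetical device names.
    for cmd in write_cache_off_commands(["/dev/sda", "/dev/sdb"]):
        print(" ".join(cmd))
```

Running the printed commands needs root, and the setting may not survive a reboot, so it would typically go in a boot script.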

      On a side note, if you need data journaling you should probably use an external journal on a separate disk/array. This way the journal device will be doing synchronous writes which is much faster on standard disks.
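
A rough sketch of how an external journal plus full data journaling might be set up on ext3, assuming the standard e2fsprogs tools. The device names, mount point, and the 4096-byte block size are assumptions for illustration (the journal device's block size has to match the filesystem's). The function only returns the commands; nothing is executed.

```python
from typing import List


def plan_external_journal(fs_dev: str, journal_dev: str, mount_point: str,
                          block_size: int = 4096) -> List[List[str]]:
    """Return the commands (not executed) to move an ext3 journal to another device."""
    return [
        # The filesystem must be unmounted while its journal is swapped.
        ["umount", mount_point],
        # Format the separate device as a dedicated journal device.
        ["mke2fs", "-O", "journal_dev", "-b", str(block_size), journal_dev],
        # Drop the internal journal, then attach the external one.
        ["tune2fs", "-O", "^has_journal", fs_dev],
        ["tune2fs", "-J", f"device={journal_dev}", fs_dev],
        # Mount with data journaling, not just metadata journaling.
        ["mount", "-t", "ext3", "-o", "data=journal", fs_dev, mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical devices: data on /dev/sda1, journal on /dev/sdb1.
    for cmd in plan_external_journal("/dev/sda1", "/dev/sdb1", "/video"):
        print(" ".join(cmd))
```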

  10. ext2/3 can be shrunk offline by davidwr · · Score: 4, Informative

    I'm not sure if gparted can do it yet, but you can shrink and grow ext2/3 partitions at the command line using a combination of tools.
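
One possible "combination of tools", sketched for the case where the ext2/3 filesystem sits on an LVM logical volume (an assumption; a plain partition would need a partitioning tool in place of lvreduce). The filesystem is shrunk slightly below the target first, the volume is reduced, and the filesystem is then grown back to fill it. The paths and sizes are made up, and the function only returns the commands without running them.

```python
from typing import List


def plan_ext_shrink_on_lvm(lv_path: str, mount_point: str,
                           target_gib: int, margin_gib: int = 1) -> List[List[str]]:
    """Return the commands (not executed) for an offline ext2/ext3 shrink on LVM."""
    if target_gib <= margin_gib:
        raise ValueError("target size must be larger than the safety margin")
    interim_gib = target_gib - margin_gib
    return [
        ["umount", mount_point],
        # resize2fs wants a freshly checked filesystem before shrinking.
        ["e2fsck", "-f", lv_path],
        # Shrink the filesystem to a bit less than the final size...
        ["resize2fs", lv_path, f"{interim_gib}G"],
        # ...then shrink the volume underneath it...
        ["lvreduce", "-L", f"{target_gib}G", lv_path],
        # ...and let the filesystem grow back to fill the volume exactly.
        ["resize2fs", lv_path],
        ["mount", lv_path, mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical volume: shrink /dev/vg0/video to 400 GiB.
    for cmd in plan_ext_shrink_on_lvm("/dev/vg0/video", "/video", 400):
        print(" ".join(cmd))
```

The order matters: reducing the volume before the filesystem would cut off the end of the filesystem. A backup beforehand is still a good idea.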

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  11. Re:Is Linux a hard requirement? by Just Some Guy · · Score: 3, Informative

    That's highly dependent on how many filesystems you have, and across how many drives. I got by just fine with AMD64/2GB on a 750GB SATA drive and maybe 20 filesystems.

    --
    Dewey, what part of this looks like authorities should be involved?
  12. Re:FAT32 by raijinsetsu · · Score: 3, Informative

    There's absolutely no disaster recovery on FAT32. It has no protections from bit errors, and has no native method of defining permissions.
    It's used on thumb drives because A) it has very little meta data that needs to be written to the drive in addition to the data (meaning: you can unplug faster), and B) it works on every OS.

  13. Re:To expand on that by GrumpyOldMan · · Score: 4, Informative

    ZFS isn't available on Linux.

    ZFS is available on Linux, via Fuse. This gives a heavy performance penalty over a native implementation(*), but it would probably be fast enough for MythTV. However, ZFS is not shrinkable, so it doesn't meet the original poster's requirements.

    (*)For a raidZ 3-disk array of WD "green" 750GB Sata drives (WD7500AACS-00ZJB0), I see 80MB/s sequential write, and 144MB/s sequential read for a native ZFS implementation on FreeBSD/amd64 7.0. For the same setup, I saw 25MB/s write and 95MB/s read from ZFS via fuse.

  14. On-topic:... by BrokenHalo · · Score: 5, Informative

    If something rock-solid is needed, one could do worse than continue to use ReiserFS3. (This is what I use.) It's feature-complete, and very stable. I have not had one mishap with it since I implemented it years ago.

    But if you want something more bleeding-edge, one could try Reiser4 (development of which I think has stagnated) or btrfs, which seems to implement the main design considerations of Reiser4, but has jagged edges waiting to be cleaned up.

    If something stable and under current maintenance is required, a conservative suggestion is of course Ext3.

    1. Re:On-topic:... by arth1 · · Score: 4, Informative

      reiserfs isn't feature complete, unless you mean "features that Hans Reiser wanted, but screw the rest". You can't use it for SELinux (without some ugly and known buggy patches), because it lacks compatible extended metadata facilities. NTFS compatible streams won't work either. There's no defragmentation possibility. And perhaps most of all, it has no dump/restore facility.

      I keep wondering why the OP wants the ability to shrink a file system. Could it be because he's thinking in ReiserFS terms, where there is no dump/restore, and thus is used to using shrink for that job? For file systems with dump/restore, one normally does a dump, recreates the FS in the desired size, then a restore. That has the advantage of the resulting file system being tuned to the new size, and unlike a regular backup/restore, will preserve any metadata, allocated extents, ACLs, sparse files, and everything else.

      Personally, I see a lack of dump/restore facilities as a much more serious handicap than lack of shrinking. Especially if you think forward, and consider that you're much more likely to replace drives with faster and bigger drives than you are to shrink them.

      I suggest XFS, and let xfsdump/xfsrestore do the job of shrink/grow.
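
The dump, recreate, restore cycle described above might look roughly like this with xfsdump/xfsrestore. This is a sketch only: the device, mount point, dump file, and label are placeholders, and the step that actually resizes the underlying partition or volume is left as a comment because it depends on the setup. Nothing is executed.

```python
from typing import List


def plan_xfs_dump_recreate_restore(device: str, mount_point: str,
                                   dump_file: str,
                                   label: str = "resize") -> List[List[str]]:
    """Return the commands (not executed) for resizing XFS via dump and restore."""
    return [
        # Level-0 (full) dump of the mounted filesystem to a file elsewhere.
        ["xfsdump", "-l", "0", "-L", label, "-M", label,
         "-f", dump_file, mount_point],
        ["umount", mount_point],
        # Resize the partition or logical volume here, by whatever means apply.
        # Recreate the filesystem at the device's new size.
        ["mkfs.xfs", "-f", device],
        ["mount", device, mount_point],
        ["xfsrestore", "-f", dump_file, mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical names throughout.
    plan = plan_xfs_dump_recreate_restore(
        "/dev/vg0/video", "/video", "/backup/video.xfsdump")
    for cmd in plan:
        print(" ".join(cmd))
```

The dump file has to live on a different filesystem with enough free space to hold it.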

  15. Re:To expand on that by tzot · · Score: 4, Informative

    XFS or JFS might be perfectly good solutions

    One should never, ever use XFS on a non-UPS-protected system. It's a great filesystem, but if you don't get the time for a sync of the in-memory structures, you're screwed.

    --
    I speak England very best
  16. Re:Difficult question by lewiscr · · Score: 3, Informative

    VxFS includes a kernel module. You can't boot off it (no grub support), and it's installed after installation, so it can't be your root FS. It can be any other mount point. I generally use it for my MySQL and PostgreSQL data partitions. I would use it for /home if I had to deal with users.

    VxFS by itself doesn't support all of those features (moving from stripe to concat, changing stripe width etc). Some of those come from VxVM (Veritas Volume Manager), which is well enough integrated with VxFS that I can resize a logical volume and filesystem with a single command.

    VxFS is the only FS that I've used that can be resized while mounted. Actually, it must be resized while mounted. I've expanded and shrunk filesystems many times while MySQL was under load. It increases the disk I/O a bit, so MySQL runs a bit slower, but otherwise there was no impact.

    Not only that, I've had a machine reboot (my fault) in the middle of a complex operation (restriping the RAID 0 portions of the RAID 0+1 array in preparation for converting to RAID 1+0). VxVM and VxFS mounted the volume fine, MySQL started serving, then VxVM picked up where it left off and completed successfully. No data lost.

    In addition, a dirty 100G+ volume takes about 15 seconds to fsck. Suck that, ext3.

    On any server that can wake me up in the middle of the night, I'll gladly pay for the Veritas Foundation Suite.

  17. You chose ReiserFS for MythTV? by sportster · · Score: 3, Informative

    Did you read the documentation? From http://www.mythtv.org/docs/mythtv-HOWTO-3.html#ss3.1
    Filesystems
    MythTV creates large files, many in excess of 4GB. You must use a 64 or 128 bit filesystem. These will allow you to create large files. Filesystems known to have problems with large files are FAT (all versions), and ReiserFS (versions 3 and 4). Because MythTV creates very large files, a filesystem that does well at deleting large files is important. Numerous benchmarks show that XFS and JFS do very well at this task. You are strongly encouraged to consider one of these for your MythTV filesystem. JFS is the absolute best at deletion, so you may want to try it if XFS gives you problems. MythTV .21 incorporates a "slow delete" feature, which progressively shrinks the file rather than attempting to delete it all at once, so if you're more comfortable with a filesystem such as ext3 (whose delete performance for large files isn't that good) you may use it rather than one of the known-good high-performance file systems. There are other ramifications to using XFS and JFS - neither offer the opportunity to shrink a filesystem; they may only be expanded. NOTE: You must not use ReiserFS v3 for your recordings. You will get corrupted recordings if you do.
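
To illustrate the "slow delete" idea from the quoted HOWTO (shrinking a file progressively instead of unlinking it in one go), here is a minimal sketch. It is not MythTV's code; the function name, chunk size, and pause are made up for the example.

```python
import os
import time


def slow_delete(path: str, chunk_bytes: int = 64 * 1024 * 1024,
                pause_seconds: float = 0.0) -> int:
    """Truncate a file from the end in chunk_bytes steps, then unlink it.

    Returns the number of truncate steps performed.
    """
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")
    size = os.path.getsize(path)
    steps = 0
    while size > 0:
        # Cut one chunk off the end so each call frees a bounded amount.
        size = max(0, size - chunk_bytes)
        os.truncate(path, size)
        steps += 1
        if pause_seconds:
            # Give other I/O (such as an ongoing recording) room to proceed.
            time.sleep(pause_seconds)
    os.unlink(path)
    return steps
```

Each truncate frees a bounded number of blocks, so no single call stalls for as long as deleting a multi-gigabyte file at once can on ext3.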

  18. Re:To expand on that by NekoXP · · Score: 3, Informative

    ZFS isn't available on Linux

    Bollocks.

    ZFS-FUSE works fine. If you can build a kernel with an initrd which loads FUSE, ZFS-FUSE and mounts the root filesystem, you have absolutely no troubles whatsoever and absolutely acceptable performance for a MythTV box and a couple of servers. And if you managed to set up MythTV over ReiserFS then this isn't going to be a problem for you at all.

    The fact that it's in userspace is not a barrier to entry, nor is it "not available" just because it's not a kernel module.

  19. Re:Difficult question by amRadioHed · · Score: 4, Informative

    VxFS is the only FS that I've used that can be resized while mounted

    What about ext3?

    --
    We hope your rules and wisdom choke you / Now we are one in everlasting peace
  20. Re:JFS can't shrink and doesn't need to (on AIX) by Wodin · · Score: 3, Informative

    JFS2 on AIX can shrink and it's silly to say it doesn't need to.

    http://www.ibm.com/developerworks/wikis/display/Wikip5/Lesson+2+-+AIX+5L+Features+and+Benefits

    The JFS2 file system shrink function supports optimizing storage utilization by removing unused disk space from the file system environment. Administrators can dynamically add and delete disk space as needed to manage both the JFS2 and LVM environments in place, without the need to copy and reboot.

    http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds1/chfs.htm

    To reduce the size of the /test JFS2 file system, enter:

    chfs -a size=-16M /test

    --
    -- Wodin
  21. Re:dump/restore is useless by Antique Geekmeister · · Score: 3, Informative

    Sync your data, and do an LVM snapshot, and dump *that*.
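
A sketch of that sequence, assuming an ext2/3 filesystem on LVM and the classic dump utility. The volume names, snapshot size, and dump file are placeholders, and the function only returns the commands without running anything.

```python
import posixpath
from typing import List


def plan_snapshot_dump(origin_lv: str, snap_name: str,
                       snap_size: str, dump_file: str) -> List[List[str]]:
    """Return the commands (not executed) to dump a filesystem via an LVM snapshot."""
    # The snapshot appears next to its origin, e.g. /dev/vg0/<snap_name>.
    snap_path = posixpath.join(posixpath.dirname(origin_lv), snap_name)
    return [
        ["sync"],
        ["lvcreate", "--snapshot", "--size", snap_size,
         "--name", snap_name, origin_lv],
        # Level-0 dump of the frozen snapshot rather than the live volume.
        ["dump", "-0", "-f", dump_file, snap_path],
        # Remove the snapshot so it stops consuming copy-on-write space.
        ["lvremove", "-f", snap_path],
    ]


if __name__ == "__main__":
    # Hypothetical names and sizes.
    for cmd in plan_snapshot_dump("/dev/vg0/video", "video_snap", "5G",
                                  "/backup/video.dump"):
        print(" ".join(cmd))
```

The snapshot size only needs to cover the writes that land on the origin volume while the dump runs, and the volume group needs that much free space.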