Best Shrinkable ReiserFS Replacement?
paulkoan writes "I have been using ReiserFS for my file system across a few servers for some time now (follow the link below for details of my experience). I can't foresee the future of ReiserFS, but if I'm going to have to migrate as support diminishes, I'd like to begin that process now. My criteria are: in-kernel support, shrinkable, and has good recovery when the file system is not closed properly. That shrinkable requirement precludes a lot of options. What's a good replacement for ReiserFS?"
I initially chose ReiserFS because I was building a MythTV system and it was the recommended FS across the board, from small to large files. I've had good experiences with ReiserFS and it has had a pummeling. That MythTV box for example has a very volatile environment and loses power on a regular basis. I haven't lost any data through any of these outages.
Compare this to my brief foray into XFS on the same box, where 25% of the filesystem ended up in lost+found with numbers for filenames. When this happened a second time on a different system I decided XFS wasn't for me — and I really don't get the point of a journalled filesystem that will keep data relatively safe, but then remove any means to identify it when things go wrong.
But everyone has good and bad experiences with filesystems, ReiserFS included. XFS has a good rep, my experience aside.
I've heard good things about ZFS from Sun Microsystems, though I don't have much experience with it. Ext3 seems to have decent crash recovery though it requires fscks almost every time. JFS2 from IBM is the most solid filesystem I've ever seen, but I don't know if such a filesystem works with MythTV.
My fastest way of checking what operations can be supported on filesystems at the present is by checking what gparted can do. Of the filesystems it works with right now, only four (jfs, reiser4, ufs, xfs) can't be shrunk using gparted.
For my MythTV installation, I chose ext3 for the system partitions like / and /usr and XFS for my /video partition. My system partitions are on a RAID 1 while my /video partition is a 1TB RAID 10 LVM. ext3 is more than adequate for my purposes and it does a decent job of recovery. Earlier this year my server started crashing intermittently with no messages in the error logs. I finally traced it to a bad stick of RAM, and ext3 recovered in most of the cases. In one case I had to repair MySQL databases, but that was the only hiccup.
Well, there's spam egg sausage and spam, that's not got much spam in it.
Ext3 with LVM seems to be the popular way to go about this. Unless you really want an esoteric solution, from your requirements I don't see a reason to stray from the norm.
JFS2 can be shrunk on AIX; I'm not sure of its support on Linux.
Ugh, ReiserFS and "good recovery when the file system is not closed properly"? It doesn't even have good recovery after a proper shutdown.
When other filesystems die, the damage is localised. When Reiser fucks up, all or nearly all of the tree is lost. Usually, you'll lose all files bigger than 4KB, although other damage modes are possible.
Reiser has a codebase of an insane size. A relatively small piece of code can be mostly bug-free, but Reiser is simply too large, complex and ill-tested. I admit I haven't given it a try recently, but you can guess why I hate the very idea of approaching it without a ten-foot pole.
I've seen XFS screw a number of random files, ext3 mangled only files that were being written to, and my personal favourite is JFS. Even though I use JFS most of the time, the only screwup I witnessed was on a RAID without a write-intent bitmap.
The creatures outside looked from Alt-Right to Antifa; but already it was impossible to say which was which.
Despite the parent trying to be funny, NTFS does support shrinking. I've used it to shrink a full disk partition down a bit to install a Linux one on the side.
(Now cue 'no room left for Windows on the drive' jokes)
ReiserFS is still being used and maintained in-kernel. It's stable, and it just works for you and for hundreds of thousands of others; so, what's the rush?
I'd wait for the next batch of next gen FS (BTRFS, Tux3) to show their stuff -- and perhaps take a look at getting involved. Daniel Phillips has recently sent out a call for help... Sounds like you have an itch -- go scratch it.
That MythTV box for example has a very volatile environment and loses power on a regular basis. I haven't lost any data through any of these outages.
Okay, you need to consider a couple of things. First off, this is MythTV. Your concept of "large files" and the normal industry use of "large files" are two entirely different things. I really doubt you are going to exceed any limitations of a modern filesystem with porn, DVDs, and television recordings.
Second, you aren't going to lose data from a power outage when it comes to archived data you are reading (divx file, for example) when the power goes out. But no file system using system memory for a cache is going to play well when abruptly having the power yanked while it's writing.
Third, just use ext3. It's one of the most used, reliable, and proven file systems to date. If it's not enough, you are better off using a UPS and a software RAID 5 array of a few similar-sized drives, with an ext3 file system.
Let's please filter further headlines where people are asking about what exotic filesystem they should be trying out for non-raid applications. PLEASE.
Performance may crawl to a standstill but ext3 with full journaling of data not just meta-data should make crash-recovery nearly bulletproof.
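For anyone wanting to try that, here is a minimal sketch of switching an ext3 filesystem to full data journaling. The device name /dev/vg0/video is a made-up example, and the run helper is only there so that nothing is executed unless you ask for it. The tune2fs setting is stored in the superblock and takes effect at the next mount; putting data=journal in the options column of /etc/fstab should achieve the same thing.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Make data=journal the default mount option for an ext3 filesystem.
# $1 = block device holding the filesystem (hypothetical example below).
enable_data_journal() {
    run tune2fs -o journal_data "$1"
}

enable_data_journal /dev/vg0/video
```

Expect writes to slow down noticeably, since file data goes through the journal before it reaches its final location.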
Another option is to reduce the number of crashes:
Make sure your software and hardware are stable and use a good, stable battery-backed power supply.
The latter is good advice for any system.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
You will need at least 8G of RAM. ZFS is an enterprise file system, which needs big hardware. So run 64-bit FreeBSD and get lots of memory.
Because it isn't.
I'm not sure if gparted can do it yet, but you can shrink and grow ext2/3 partitions at the command line using a combination of tools.
Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
2: You're proposing a reactive method of systems administration. This might be fine for a hobbyist who doesn't care about his system(s), but for a production environment this is playing with fire. You know that support for ReiserFS will disappear (unless you know for a fact that another person/group has stepped up to provide support); why wait until the last possible second, when you'll only have more work to do, to migrate your systems to a new filesystem? Don't put off to tomorrow that which can be done today.
Exactly. Start looking at alternatives right now. Put together a test environment to see how the new filesystem holds up to abuse. Schedule some downtime and start moving the data over with minimal disruption to your clients.
Then sit back and relax knowing that you won't be scrambling for a new filesystem at the last minute later on.
That's highly dependent on how many filesystems you have, and across how many drives. I got by just fine with AMD64/2GB on a 750GB SATA drive and maybe 20 filesystems.
Dewey, what part of this looks like authorities should be involved?
Have a look at this (if nobody else already posted it):
http://linuxmafia.com/faq/Filesystems/reiserfs.html
There's absolutely no disaster recovery on FAT32. It has no protections from bit errors, and has no native method of defining permissions.
It's used on thumb drives because A) it has very little meta data that needs to be written to the drive in addition to the data (meaning: you can unplug faster), and B) it works on every OS.
I use WinXP (w. NTFS) for a PVR app. It works ... BUT I have a serious problem with fragmentation. Very noticeable during video playback. I added a scheduled task to defrag once a week (along w. a weekly reboot). I also need to make sure that I never fill the drive too full.
[Insert pithy quote here]
ZFS isn't available on Linux.
ZFS is available on Linux, via Fuse. This gives a heavy performance penalty over a native implementation(*), but it would probably be fast enough for MythTV. However, ZFS is not shrinkable, so it doesn't meet the original poster's requirements.
(*)For a raidZ 3-disk array of WD "green" 750GB Sata drives (WD7500AACS-00ZJB0), I see 80MB/s sequential write, and 144MB/s sequential read for a native ZFS implementation on FreeBSD/amd64 7.0. For the same setup, I saw 25MB/s write and 95MB/s read from ZFS via fuse.
I have migrated most of my ReiserFS partitions to either XFS or EXT3. For something like MythTV XFS is probably the better choice since it excels at large files. My experience with Reiser is that it tended to suck for large files, especially writes. I also love the XFS tools, like being able to defragment a mounted filesystem and xfsdump.
EXT3 has also made huge strides, especially with the directory hashing feature. I do not like how long fsck takes after so many mounts, though, or for recovery.
Also, regardless of filesystem, set the noatime and nodiratime parameters in fstab to see another big performance boost.
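A rough sketch of that tip, assuming a recordings filesystem mounted at /video (an example mount point, not something you must use). The remount applies the options immediately; the commented fstab line is an illustration of how to make them permanent. On Linux, noatime already covers directories, so nodiratime is kept here only to match the advice as given.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Stop access-time updates on an already mounted filesystem.
# $1 = mount point.
set_noatime() {
    run mount -o remount,noatime,nodiratime "$1"
}

set_noatime /video

# Illustrative /etc/fstab entry to keep the options across reboots
# (device and filesystem type are assumptions):
#   /dev/vg0/video  /video  xfs  noatime,nodiratime  0  2
```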
This post is encrypted twice with ROT-13. Documenting or attempting to crack this encryption is illegal.
First, I don't understand the need for a shrinkable filesystem at all. I've only ever grown filesystems in my time as a systems administrator, and then it was just easier to move the whole thing to another drive rather than mess about. That's a rule that's held since the dark days of DOS and 20MB hard drives, although there was a program called FIPS that could do amazing things with partitions for the time. I've never seen partitions or drives that ever needed to get smaller, and the only thing that indicates is that you can't afford a larger hard drive and you've hit capacity and you don't want to delete those Windows games...
Second, if you're getting lost+found files on anything journalled, it's because you've not got "full" journalling switched on, you've not got the latest kernel, or you've hit an unusual bug. The first is most likely, because you're probably running on a "middle-ground" option, like ext3 also has by default, which says "favour speed over safety". The reason for this will become clear the instant you run a "full" journalling system. It's incredibly slow to write, because everything effectively gets written twice. The "slow deletion on ext3" problems with MythTV are a thing of the past: a thread does it in the background now and you never know it's happening.
Third, I don't see why the filesystem is that critical for, of all things, a MythTV box. It's hardly vital stuff we're talking about here. If you are THAT worried, you'd have a UPS on the thing and backups, or net-backup to a proper storage PC. You're obviously not. Thus, use whatever's available and if and when you decide on a replacement filesystem because something a) isn't supported, b) isn't suitable or c) disappears from the Linux kernel, then you can... shock, horror, copy the data to a new partition with a new filesystem on it then.
Fourth, if you are really that geeky that you can't have Reiser now because it's no longer fashionable (which is what it sounds like more than anything else, and you've come up with the "shrinkable" thing to try to bolster that position), then why not have a RAID (battery backed if you don't want to lose data, remember!). Or why not put DATA separate from OS in different partitions, have a read-only OS partition (it's MythTV, you could boot it from a CD) and then the worst that will happen is you will lose the currently written file on the Data partition (which might be that program you wanted to record, but better than trashing the system).
If something rock-solid is needed, one could do worse than continue to use ReiserFS3. (This is what I use.) It's feature-complete, and very stable. I have not had one mishap with it since I implemented it years ago.
But if you want something more bleeding-edge, one could try Reiser4 (development of which I think has stagnated) or btrfs, which seems to implement the main design considerations of Reiser4, but has jagged edges waiting to be cleaned up.
If something stable and under current maintenance is required, a conservative suggestion is of course Ext3.
That's just not true. I have two 320GB hard drives in a ZFS mirror, with no less than 64 filesystems, and "only" 1GB of RAM. I had a slightly smaller non-mirrored array for a long time on a weaker machine (32-bit, 512MB RAM) with no problems also.
This is under FreeBSD.
-:sigma.SB
WARN
THERE IS ANOTHER SYSTEM
XFS or JFS might be perfectly good solutions
One should never, ever use XFS on a non-UPS-protected system. It's a great filesystem, but if you don't get the time for a sync of the in-memory structures, you're screwed.
I speak England very best
VxFS includes a kernel module. You can't boot off it (no grub support), and it's installed after the OS installation, so it can't be your root FS. It can be any other mount point. I generally use it for my MySQL and PostgreSQL data partitions. I would use it for /home if I had to deal with users.
VxFS by itself doesn't support all of those features (moving from stripe to concat, changing stripe width etc). Some of those come from VxVM (Veritas Volume Manager), which is well enough integrated with VxFS that I can resize a logical volume and filesystem with a single command.
VxFS is the only FS that I've used that can be resized while mounted. Actually, it must be resized while mounted. I've expanded and shrunk filesystems many times while MySQL was under load. It increases the disk I/O a bit, so MySQL runs a bit slower, but otherwise there was no impact.
Not only that, I've had a machine reboot (my fault) in the middle of a complex operation (restriping the RAID 0 portions of the RAID 0+1 array in preparation for converting to a RAID 1+0). VxVM and VxFS mounted the volume fine, MySQL started serving, then VxVM picked up where it left off and completed successfully. No data lost.
In addition, a dirty 100G+ volume takes about 15 seconds to fsck. Suck that, ext3.
On any server that can wake me up in the middle of the night, I'll gladly pay for the Veritas Foundation Suite.
Did you read the documentation? From http://www.mythtv.org/docs/mythtv-HOWTO-3.html#ss3.1
Filesystems
MythTV creates large files, many in excess of 4GB. You must use a 64 or 128 bit filesystem. These will allow you to create large files. Filesystems known to have problems with large files are FAT (all versions), and ReiserFS (versions 3 and 4). Because MythTV creates very large files, a filesystem that does well at deleting large files is important. Numerous benchmarks show that XFS and JFS do very well at this task. You are strongly encouraged to consider one of these for your MythTV filesystem. JFS is the absolute best at deletion, so you may want to try it if XFS gives you problems. MythTV .21 incorporates a "slow delete" feature, which progressively shrinks the file rather than attempting to delete it all at once, so if you're more comfortable with a filesystem such as ext3 (whose delete performance for large files isn't that good) you may use it rather than one of the known-good high-performance file systems. There are other ramifications to using XFS and JFS - neither offer the opportunity to shrink a filesystem; they may only be expanded.
NOTE: You must not use ReiserFS v3 for your recordings. You will get corrupted recordings if you do.
The critical factor being CPU overhead. Fuse based file systems are nice and you can solve a lot of problems with them, and they certainly can exhibit good throughput but the situation with an HTPC is that generally you have limited CPU resources and you definitely have significant CPU demands in most cases.
The best case scenario is you have hardware MPEG decoding and all you're doing is watching a stream that is already on disk. In that case you're probably fine with most anything, and even antique machines will usually work fine.
But commonly you might see something like an HTPC running a fairly low power CPU where you're pulling data off some source and meanwhile watching something else, and frequently decoding the incoming stream, transcoding it and at the same time decoding what you're watching (which may be HD content for that matter). In those cases every CPU cycle counts and the overhead of a user land file system is going to burn you.
The upshot being ext3 still ends up in general being the best available recommendation.
"Malo periculosam, libertatem quam quietam servitutem." -- Jefferson
Bollocks.
ZFS-FUSE works fine. If you can build a kernel with an initrd which loads FUSE, ZFS-FUSE and mounts the root filesystem, you have absolutely no troubles whatsoever and absolutely acceptable performance for a MythTV box and a couple of servers. And if you managed to set up MythTV over ReiserFS then this isn't going to be a problem for you at all.
The fact that it's in userspace is not a barrier to entry, nor is it "not available" just because it's not a kernel module.
What about ext3?
We hope your rules and wisdom choke you / Now we are one in everlasting peace
Ext3 is shrinkable, just not when the file system is mounted. One can, however, grow the file system even while it's mounted.
Please see man resize2fs
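For the ext3-on-LVM setups mentioned elsewhere in this thread, an offline shrink might look roughly like the sketch below. The volume name, mount point and sizes are made-up examples. The filesystem is first shrunk to a bit below the target, then the logical volume is reduced, then the filesystem is grown back to fill the volume exactly, so the volume is never smaller than the filesystem on it. The run helper prints the commands instead of executing them unless DRY_RUN=0 is set. Have a backup before trying this for real.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Shrink an ext3 filesystem that lives on an LVM logical volume.
# $1 = logical volume, $2 = interim filesystem size (below the target),
# $3 = final logical volume size, $4 = mount point.
shrink_ext3() {
    lv=$1 fs_size=$2 lv_size=$3 mnt=$4
    run umount "$mnt"                 # ext3 cannot be shrunk while mounted
    run e2fsck -f "$lv"               # resize2fs wants a freshly checked filesystem
    run resize2fs "$lv" "$fs_size"    # shrink the filesystem first
    run lvreduce -L "$lv_size" "$lv"  # then shrink the volume underneath it
    run resize2fs "$lv"               # grow the filesystem to fill the volume
    run mount "$lv" "$mnt"
}

shrink_ext3 /dev/vg0/video 38G 40G /video
```

Growing is the easy direction: extend the volume and run resize2fs with no size argument, which can be done while the filesystem is mounted.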
Well, the semantics of shrinking are a little odd with ZFS. I just want to clarify for anyone else reading here...
ZFS usually doesn't do partitions at all. It can, for the sake of interop with other filesystems. Generally what you do is set up a drive (or partition) as a ZFS pool. The pool is just space for storage. You can connect drives together into the same pool for aggregation, mirroring, or some combination thereof (this replaces RAID, and RAID-Z is claimed to be more reliable than hardware RAID). Setting up RAID on ZFS is mind-bogglingly simple. I bought one drive and set it up on ZFS, then the second a month (i.e. paycheck) later. Adding the second drive to the pool was a single-line command, without ever needing to take the first drive down (well, a reboot for adding the hardware, but that's a SATA issue only; SAS and ZFS won't care).
ZFS filesystems are created in pools. They take up as much space as they need, the rest is left free for other filesystems in the pool. You can create/backup/restore/delete filesystems with single-line commands.
Care about electronic freedom? Consider donating to the EFF!
but for the type of application discussed here it is probably overkill in terms of management complexity, etc.
In ZFS, here is how you format a disk device called /dev/ad10, mount it to /storage, and have it automatically mount itself on startup:
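A minimal sketch of that on FreeBSD, assuming the pool is named storage. A new pool is mounted at /<poolname> by default, which is why no separate mount step appears, and it comes back at boot as long as zfs_enable="YES" is set in /etc/rc.conf. The run helper only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Create a single-disk pool; ZFS mounts it at /<pool name> by itself.
# $1 = pool name (assumed here to be "storage"), $2 = disk device.
zfs_setup() {
    run zpool create "$1" "$2"
}

zfs_setup storage /dev/ad10
```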
In Linux, here's how you format a disk called /dev/sdb, mount it to /storage, and have it automatically mount itself on startup:
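A sketch of the usual Linux steps, assuming ext3 and a plain fstab entry (the filesystem type and the fstab fields are assumptions, not taken from the post). Note that mkfs.ext3 asks for confirmation when handed a whole disk rather than a partition. The run helper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Format a device as ext3, mount it, and register it in /etc/fstab.
# $1 = block device, $2 = mount point.
ext3_setup() {
    dev=$1 mnt=$2
    run mkfs.ext3 "$dev"      # prompts before formatting a whole disk
    run mkdir -p "$mnt"
    run mount "$dev" "$mnt"
    # Append a boot-time mount entry (assumed fields: defaults, fsck pass 2)
    run sh -c "echo '$dev $mnt ext3 defaults 0 2' >> /etc/fstab"
}

ext3_setup /dev/sdb /storage
```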
On my FreeBSD box ZFS is probably the easiest and most intuitive set of commands to use. In addition to that, I also find it much easier to troubleshoot failing hardware. In other systems like linux, I would not know that my hardware is failing until I go back to read the data. With ZFS I can just schedule a scrub or run the scrub every week or so and check the status of my data integrity even while the file system is online and in use.
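The scrub routine described above can be sketched like this, assuming a pool named storage. The scrub command returns straight away and the check runs in the background while the pool stays in use; the status command shows progress and any errors found. The run helper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Start an online integrity check of a pool, then show its state.
# $1 = pool name (assumed here to be "storage").
scrub_and_report() {
    run zpool scrub "$1"
    run zpool status -v "$1"
}

scrub_and_report storage

# Illustrative root crontab entry for a weekly scrub, Sundays at 03:00
# (schedule and path are assumptions):
#   0 3 * * 0 /sbin/zpool scrub storage
```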
JFS2 on AIX can shrink and it's silly to say it doesn't need to.
http://www.ibm.com/developerworks/wikis/display/Wikip5/Lesson+2+-+AIX+5L+Features+and+Benefits
http://publib.boulder.ibm.com/infocenter/pseries/v5r3/index.jsp?topic=/com.ibm.aix.cmds/doc/aixcmds1/chfs.htm
-- Wodin
Sucks you got modded down but I'll back you up. As someone who manages a security research lab of 80 machines running XFS I've never seen a single XFS problem. The one time we had a corrupted XFS system it turned out to be the SCSI controller. XFS doesn't support shrinking a filesystem but it does support growing the filesystem in place while mounted. XFS is also the only linux filesystem to have a native defragger (xfs_fsr).
Ext3 on the other hand is not robust. Under heavy IO it simply corrupts its state and panics. Try putting 40 reader/writer processes on a 1TB filesystem doing 20MB/s random IO for a month. I guarantee Ext3 will kernel panic before the month is up and trash the filesystem in the process. Ext3 is also the slowest linux filesystem to boot.
ReiserFS has an incomplete toolset to check and fix the filesystem after an unclean shutdown. As such it can't be considered for serious use.
JFS from IBM looks good feature wise but the performance tanks when doing parallel IO. If you have an application that supports high concurrency doing disk operations stay away from JFS.
Sync your data, and do an LVM snapshot, and dump *that*.
Why would you want to take it offline?
You create the new file system, then freeze the original and pipe the xfsdump to xfsrestore, then remount the partition using the new source.
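Roughly, and assuming writers on the old filesystem have already been stopped, the copy step could look like the sketch below. The device and mount point names are made up. The dash tells xfsdump to write to stdout and xfsrestore to read from stdin, and -J keeps both from touching the dump inventory. The freeze and the final remount are left out because they depend on your setup. The run helper only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: each command is printed, not executed.
# Set DRY_RUN=0 and run as root to execute for real.
run() {
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "+ $*"; fi
}

# Copy an XFS filesystem onto a new (for example smaller) XFS filesystem.
# $1 = old mount point, $2 = new block device, $3 = new mount point.
xfs_migrate() {
    old_mnt=$1 new_dev=$2 new_mnt=$3
    run mkfs.xfs "$new_dev"
    run mkdir -p "$new_mnt"
    run mount "$new_dev" "$new_mnt"
    # Stream a level 0 dump of the old filesystem straight into the new one
    run sh -c "xfsdump -J - '$old_mnt' | xfsrestore -J - '$new_mnt'"
}

xfs_migrate /video /dev/vg0/video_new /mnt/video_new
```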
If you can't even freeze a partition every now and then, you probably don't have consistent backups either, eh? And you call that "production"?
xfsdump is pretty darn good for backups, by the way, supporting incremental backups as well as marking both individual files and folders for skipping, if you so desire.