Slashdot Mirror


A Good Filesystem for Storing Large Binaries?

jZnat asks: "I own hundreds of gigabytes of binary data, usually backed up from other mediums such as CDs and DVDs. However, I cannot figure out which filesystem would be best for storing all this reliably. What I'm looking for is a WORM-optimized FS that also has good journaling methods to prevent data loss due to some natural disaster while data is being shifted around. Trying something new for once, I tried using SGI's XFS due to its promising details, but I was met with countless IO errors after trying to write large amounts of data to it. I feel that Ext3 is not optimal for this; ReiserFS is too slow when it comes to reading large data files; and Reiser4 isn't mature enough to entrust my digital assets to. What filesystem would be most appropriate for these needs?"

7 of 214 comments

  1. JFS by member57 · · Score: 4, Informative

    I use JFS on RAID 5, no errors, uptime of 200+ days currently. Handling large files 200-300MB each all day long. Excellent performance.

    --
    If Kerry was the answer, it must have been a stupid question.
    The UN - The largest "political" cause of death.
  2. Comparison of File Systems by NuclearDog · · Score: 5, Informative

    Comparison of FileSystems (from Wikipedia)

    Personally, I run two 300GB drives in RAID1 on UFS and am quite satisfied with it, but you seem to be incredibly, incredibly picky, so I'm sure you could find something wrong with it ;P

    ND

    --
    This statement is forty-five characters long.
  3. I/O Errors??? by Stephen+Samuel · · Score: 4, Informative
    If you're getting lots of I/O errors with XFS, I'd be inclined to look at a hardware problem (unless the I/O errors consist of attempts to read past the end of the partition -- which could be caused by you manually specifying the partition size, rather than letting mkfs.xfs figure it out).

    Like someone else said -- try using badblocks(8) -- or just use dd to make sure you can read the entire partition without errors.
    Bad disks do happen -- even new ones. Production code in Linux is generally very stable, and (unlike with Windows), you can usually start with the presumption that things like I/O errors are caused by real hardware problems of some sort (even if it's just bad/loose cables).
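
A rough sketch of the "read the entire partition" check, for anyone who would rather script it than run dd by hand. This is only an illustration: `read_check` is a made-up helper name, and it simply reads a file or block device from start to end and reports the first I/O error it hits. Reads still go through the kernel page cache, so recently written data may be served from memory rather than from the disk.

```python
def read_check(path, chunk_size=1024 * 1024):
    """Read `path` sequentially from start to end.

    Returns (bytes_read, error), where error is None if the whole
    file or device could be read, or the OSError that stopped the read.
    """
    total = 0
    try:
        # buffering=0 opens a raw file object, so each read() maps to
        # one read request and an error surfaces on the chunk that fails.
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:  # empty bytes means end of file/device
                    break
                total += len(chunk)
    except OSError as exc:
        return total, exc
    return total, None
```

Calling it on a partition (for example `read_check("/dev/sdb1")`, where the device name is just a placeholder) needs read permission on the device, usually root. A non-None error, together with the byte offset it stopped at, points at the hardware rather than the filesystem.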

    --
    Free Software: Like love, it grows best when given away.
  4. Re: ext3 works fine, did you try it? by Matt+Perry · · Score: 4, Informative
    I feel that Ext3 is not optimal for this
    Did you try it with ext3? I have 688G in a RAID5 array spread across four 250GB drives. I use ext3 and I store lots of large files (15GB free on the array right now). I have about 156GB of DVD images, mostly movies that I own and have ripped to watch using Daemon Tools on Windows. Some of them are rips of training video DVDs I bought for software that I use like Adobe Premiere and Audition. I frequently move large AVI files to and from the array for video projects that I'm working on. These files originate on my Windows box and can be as large as 13GB (for an hour of video footage). I've been using ext3 for years and it's never let me down or given me any problems.
    --
    Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
  5. Re:Keep it simple. ext2 or fat32. by dougmc · · Score: 3, Informative
    There is no call for a complex filesystem just because you want to store large files. ext2 (and to some extent fat32) will do just fine
    fat32 cannot handle files over 4 GB in size at all. That alone probably renders it totally unsuitable for this person's needs.
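
To put a number on that: the FAT32 limit is 2^32 - 1 bytes per file, so a full DVD image or a 13GB capture is well over it. A minimal sketch for finding the offenders in an existing archive follows; the function names are made up for illustration and nothing here comes from a particular tool.

```python
import os

# Largest file FAT32 can store: 4 GiB minus one byte.
FAT32_MAX_FILE_SIZE = 2**32 - 1


def too_big_for_fat32(size_bytes):
    """True if a file of this size cannot be stored on FAT32."""
    return size_bytes > FAT32_MAX_FILE_SIZE


def oversized_files(root):
    """Yield (path, size) for every file under `root` over the limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # unreadable entry or broken symlink: skip it
            if too_big_for_fat32(size):
                yield path, size
```

Running `list(oversized_files("/archive"))` (the path is a placeholder) lists everything that would have to be split before it could go on a FAT32 volume.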

    Beyond that, I'd say pretty much anything will work fine -- most of the optimizations found in filesystems are needed for lots of small files, not a few large files. For large files, the speeds they can be accessed by various filesystems are not likely to vary more than a few percent unless you let the files get fragmented (which probably isn't a big concern here.)

    And you are right -- if something does go wrong, ext2 or ext3 will probably give you the most options for recovering it. NTFS probably has even more recovery options (and FAT even more, as mentioned), but I'm guessing the OS will be *nix. But really, if your goal is reliability, you don't want some esoteric filesystem that can recover from disk errors (because ultimately, none can, though I guess one could be designed to keep ECC codes on the same disk transparently -- but I'm aware of no such filesystem existing) -- you want multiple copies of your data. Keeping 5-10% (or more) par2 files for your archive can help a lot in recovering it if your media goes partially bad, and having md5sums or CRC32s of all archived files can help determine if you did recover something accurately, but really there's little substitute for multiple copies of important data in multiple geographical locations. (And no -- RAID is not a substitute for backups, no matter how many mirrored drives you have. Not that I saw anybody suggest this yet, but it seems to always come up in response to questions like this, so consider this to be a preemptive mention of that.)
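
The md5sum part of that is easy to script. Below is a minimal sketch using Python's standard hashlib; the helper names (`file_md5`, `build_manifest`, `verify`) are invented for this example. It only detects damage. Repairing it still needs par2 data or another copy, which this sketch does not attempt.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its MD5 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            manifest[os.path.relpath(path, root)] = file_md5(path)
    return manifest


def verify(root, manifest):
    """Return sorted relative paths that are missing or no longer match."""
    bad = []
    for rel, expected in manifest.items():
        path = os.path.join(root, rel)
        if not os.path.isfile(path) or file_md5(path) != expected:
            bad.append(rel)
    return sorted(bad)
```

Build the manifest once when the archive is written, keep a copy of it somewhere other than the archive disk, and run `verify` after any restore or disk scare. An empty list means every file still matches its recorded digest.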

  6. Re:Ext3 or XFS. by baptiste · · Score: 5, Informative
    Check out the latest. What? 2003? Haven't there been any bug fixes since then?

    While it sucks you've lost data because of XFS, many people use it heavily every day without issue (I'm one of them). I've deployed XFS across mail, database, and web servers without issue. Your statements about it are total FUD. The reason the last 'release' was in 2003 is that not long after that, XFS was accepted into the kernel itself. Thus there was no longer a need to 'release' XFS patches for the kernel. If you look at the command packages, you'll see them being updated on a regular basis.

    As for bugs, I think your statement of bugs not being fixed is incorrect as well. Check the closed bug list. You'll see many that are being closed. Also, your open bug list above does appear rather long. But MANY of those bugs are from users who opened a bug saying 'XFS Crashed On Me' and then never followed up with more info. The XFS developers haven't cleaned many of those out, it seems. Bugs in the 200s date from 2003, bugs in the 300s from 2004, and the late 300s and 400s from 2005.

    So I hate that you've had data loss -- I wouldn't wish that on anybody (having experienced a RAID5 triple disk failure combined with backup tape failure. Thank goodness for OnTrack!) But don't post FUD about a filesystem that has performed very well for a lot of people and continues to be improved and innovative.

  7. FMWORM by davecb · · Score: 3, Informative
    Also spelled FM-WORM, a filesystem which looks like a normal NFS server but knows intimately what needs to be done to deal with WORM disks.

    It's a commercial product from Siemens, which I used years ago for Sietec's large-scale imaging product.

    There is probably a Linux port: we ran it on almost everything in existence (;-))

    --dave

    --
    davecb@spamcop.net