A Good Filesystem for Storing Large Binaries?
jZnat asks: "I own hundreds of gigabytes of binary data, usually backed up from other mediums such as CDs and DVDs. However, I cannot figure out which filesystem would be best for storing all this reliably. What I'm looking for is a WORM-optimized FS that also has good journaling methods to prevent data loss due to some natural disaster while data is being shifted around. Trying something new for once, I tried using SGI's XFS due to its promising details, but I was met with countless IO errors after trying to write large amounts of data to it. I feel that Ext3 is not optimal for this; ReiserFS is too slow when it comes to reading large data files; and Reiser4 isn't mature enough to entrust my digital assets to. What filesystem would be most appropriate for these needs?"
JFS is about the only one not mentioned that's in Linux.
I use JFS on RAID 5, no errors, uptime of 200+ days currently. Handling large files 200-300MB each all day long. Excellent performance.
If Kerry was the answer, it must have been a stupid question.
The UN - The largest "political" cause of death.
I've been using JFS for about two months now, and it's been quite a pleasant experience with my anime storage. I run it on a 1.6TB array of four 400GB hard drives. Its performance is damn fast from what I've observed copying to/from a JFS FireWire drive. I trust it enough to keep data that I can't back up on it until I can get another identical array to mirror with - only drive failure seems likely to kill this FS, and these drives are about 3 months old, so failure isn't that much of a concern anymore. I'd recommend it for storage; I haven't tried it as a system FS yet.
reiser or jfs are both solid for this kind of work, with large file and volume support. personally i swear by reiser for my 2tb volume, and have had no problems so far, although there is a minor speed penalty when working with several multi-gigabyte files at once, something to do with shared fs locks/mutexes i'd imagine.
OTOH JFS is quite stable, and though it has less of the elegance in feature set I find in reiser, tends to make up for it with enhanced ruggedness and its handling of large volume/files.
Really can't recommend anything else: as you say, reiser4 is still untested for reliability imho, xfs has issues that vary from kernel to kernel, and ext3 appears quite primitive in comparison, although its journaling seems comparable to the other choices.
JFS if you need the speed, it's dead fast at large scales, slower with small files; otherwise Reiser3 is an excellent all-round performer.
The first rule of USENET is you do not talk about USENET.
I have a 3 TB XFS file system and a 10 TB XFS file system that are regularly accessed by multiple processes that read and write hundreds of gigabytes each without write errors and with excellent performance (several hundred MByte/s sustained). You may have other hardware or software issues if you're seeing errors with XFS. Try to figure out the root cause of your problem before you try another FS.
As a general rule, the latest and greatest stuff will be full of bugs. Give zfs a few years before you trust it with anything critical.
Comparison of FileSystems (from Wikipedia)
;P
Personally, I run two 300GB drives in RAID1 on UFS and am quite satisfied with it, but you seem to be incredibly, incredibly picky, so I'm sure you could find something wrong with it
ND
This statement is forty-five characters long.
Like someone else said -- try using badblocks(8) -- or just use dd to make sure you can read the entire partition without errors.
Bad disks do happen -- even new ones. Production code in Linux is generally very stable, and (unlike with windows), you can usually start with the presumption that things like I/O errors are caused by real hardware problems of some sort (even if it's just bad/loose cables).
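As a rough illustration of the "read the entire partition" check suggested above, here is a minimal Python sketch. It is not a replacement for badblocks(8) or dd; the function name scan_readable and the 1 MiB chunk size are assumptions made for this example, and reading a raw device normally requires root.

```python
import os


def scan_readable(path, chunk_size=1024 * 1024):
    """Read `path` from start to end and report where reads fail.

    Returns (bytes_read, failed_offsets). `path` may be a regular file
    or a block device node (hypothetical example: /dev/sdb1).
    """
    bytes_read = 0
    failed_offsets = []
    # buffering=0 gives a raw file object, so each read() is one OS read.
    with open(path, "rb", buffering=0) as dev:
        size = dev.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            try:
                dev.seek(offset)
                chunk = dev.read(min(chunk_size, size - offset))
            except OSError:
                # An I/O error here usually points at the hardware;
                # note the offset and skip ahead by one chunk.
                failed_offsets.append(offset)
                offset += chunk_size
                continue
            if not chunk:
                break  # unexpected end of data
            bytes_read += len(chunk)
            offset += len(chunk)
    return bytes_read, failed_offsets
```

Calling scan_readable on a device and getting an empty list of failed offsets only says the data was readable at that moment; recently written data may be served from the page cache rather than the platters.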
Free Software: Like love, it grows best when given away.
The data integrity of ZFS would be especially good for large binaries. ZFS stores checksums in a way (IIRC) that lets errors in files be detected and repaired automatically across a RAID. I think this is part of Sun's "self healing" marketing push. It would also give a heads-up on a failing hard drive in advance of the actual failure.
Slashdot: Failed Car Analogies. Amateur Lawyering. Anecdote Battles.
Beyond that, I'd say pretty much anything will work fine -- most of the optimizations found in filesystems are needed for lots of small files, not a few large files. For large files, the speeds they can be accessed by various filesystems are not likely to vary more than a few percent unless you let the files get fragmented (which probably isn't a big concern here.)
And you are right -- if something does go wrong, ext2 or ext3 will probably give you the most options for recovering it. NTFS probably has even more recovery options (and FAT even more, as mentioned), but I'm guessing the OS will be *nix. But really, if your goal is reliability, you don't want some esoteric filesystem that can recover from disk errors (because ultimately, none can, though I guess one could be designed to keep ECC codes on the same disk transparently -- but I'm aware of no such filesystem existing) -- you want multiple copies of your data. Keeping 5-10% (or more) par2 files for your archive can help a lot in recovering it if your media goes partially bad, and having md5sums or CRC32s of all archived files can help determine if you did recover something accurately, but really there's little substitute for multiple copies of important data in multiple geographical locations. (And no -- RAID is not a substitute for backups, no matter how many mirrored drives you have. Not that I saw anybody suggest this yet, but it seems to always come up in response to questions like this, so consider this to be a preemptive mention of that.)
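The md5sum idea above can be sketched in a few lines of Python: record a checksum per archived file, then re-check later to see which files no longer match. The function names (file_md5, build_manifest, verify_manifest) are made up for this example, and the manifest is just an in-memory dict; in practice you would save it alongside each copy of the archive. This only detects damage, it does not repair anything the way par2 can.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under `root` (as a relative path) to its MD5."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_md5(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing or changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_md5(p) != expected:
            bad.append(rel)
    return sorted(bad)
```

Build the manifest once when the archive is written, and run verify_manifest against every copy from time to time; an empty list means every file still matches its recorded checksum.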
Nothing, the programmers will tell you that themselves. Journalled filesystems aren't for protecting your files. They're for protecting your filesystems.
The point of a journal is that you can roll back to a consistent state of the filesystem easily in case of error -- not that you can roll back to a consistent state for a given file (or indeed any file on the filesystem). In point of fact, it's usually more difficult to recover actual data from a journalled filesystem than a traditional one, because the writing process is much more complex. What's more, if an automatic fsck is needed, it's actually a little more likely to lose some data on a journalled filesystem because a non-journal filesystem recovers based on the found files while a journalled one recovers based on a separately recorded journal (you do put your journals on a separate block device from the filesystem, right? Otherwise you're mostly wasting your effort.)
The main point of a journalled filesystem is that when both redundant circuits in your datacenter go at once (which happens) the boxes will be able to at least boot and get through the automatic fscks without a tech needing to drive out there and run the fscks himself.
All's true that is mistrusted
For all my WORM disks, I use either ISO 9660 or ISO 9660/UDF bridge format.
Yeah, I simply burn CD-Rs or DVDs. DVDs have the nice property of being easily stored off-site. And files are in nice large contiguous blocks, so even if the filesystem dies you can still recover a lot. Unlike XJFReiFS 2.3.1.5, DVDs will be readable in 50 years' time.
And if you need to burn really large files, just use, well, zip. And perhaps some par2 files.
Though, seriously, they're coming up with a UDF variant for hard drives too.
SCO employee? Check out the bounty
While it sucks you've lost data because of XFS, many people use it heavily every day without issue (I'm one of them). I've deployed XFS across mail, database, and web servers without issue. Your statements about it are total FUD. The reason the last 'release' was in 2003 is that not long after that, XFS was accepted into the kernel itself. Thus there was no longer a need to 'release' XFS patches for the kernel. If you look at the command packages, you'll see them being updated on a regular basis.
As for bugs, I think your statement of bugs not being fixed is incorrect as well. Check the closed bug list. You'll see many that are being closed. Also, in your open bug list above, it does appear rather long. But MANY of those bugs are from users who opened a bug saying 'XFS Crashed On Me' and then never followed up with more info. The XFS developers haven't cleaned many of those out it seems. Bugs in the 200s date from 2003, bugs from the 300's from 2004. Late 300's and 400's from 2005.
So I hate you've had data loss - I wouldn't wish that on anybody (having experienced a RAID5 triple disk failure combined with backup tape failure. Thank goodness for OnTrack!) But don't post FUD about a filesystem that has performed very well for a lot of people and continues to be improved and innovative.
Top Most Bizarre/Disturbing Error Messages
What you're looking for is Universal Disk Format or UDF.
It is an open standard supported by all of the major OSes and manufacturers and is the filesystem of choice for Ultra Density Optical WORM and rewritable disks.
There are drivers for Linux, Windows and all of the major UNIXes. Here is the obligatory Wikipedia entry.
Hard disk filesystems like XFS, JFS, Reiser, ZFS etc. are all wonderful at what they do but they are unsuitable for WORM disks.
Stick Men
Others have said good things in general (XFS,JFS,ext3).
I looked into filesystem comparisons in setting up a MythTV box.
My issues were:
(1) efficient use of hard drive space, and
(2) performance.
Efficient use = filesystem settings have a big effect on amount of usable space.
For ext2/3:
-m 0 = setting 'reserved space for root' to 0%. Default is 5%, which can be 10-20 GB these days, all unusable to non-root users
-T ____ = can tell ext2/3 to optimize inodes and byte-per-inode for different size average files. Largefile versus news spools (tons of small files). Because of the way that a file can be spread out and mapped across the filesystem, this has an effect on 'wasted' space, and maybe performance (# of inode entries per file to lookup).
-b, -i - can set total # of inodes and bytes-per-inode directly. Advanced control over filesystem creation
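To put rough numbers on the options above, here is a small Python calculation. The 128-byte inode size and the 8192 and 4 MiB bytes-per-inode figures are assumptions for illustration only, not values taken from this thread; check mke2fs.conf and the mke2fs man page on your own system.

```python
GB = 10**9  # decimal gigabyte, as used on drive labels


def reserved_bytes(volume_bytes, reserved_percent=5.0):
    """Space held back for root by the -m setting."""
    return int(volume_bytes * reserved_percent / 100)


def inode_count(volume_bytes, bytes_per_inode):
    """Approximate number of inodes created for a given -i ratio."""
    return volume_bytes // bytes_per_inode


def inode_table_bytes(volume_bytes, bytes_per_inode, inode_size=128):
    """Approximate space used by inode tables (inode_size is assumed)."""
    return inode_count(volume_bytes, bytes_per_inode) * inode_size


# A 400 GB drive with the default 5% reserve loses 20 GB to root.
print(reserved_bytes(400 * GB) / GB)
# Inode tables at an assumed 8192 bytes per inode versus an assumed
# 4 MiB per inode (a large-file profile).
print(inode_table_bytes(400 * GB, 8192) / GB)
print(inode_table_bytes(400 * GB, 4 * 1024 * 1024) / GB)
```

Under these assumptions the inode tables shrink from several gigabytes to a few megabytes, which is where the extra usable space from a large-file profile comes from.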
I never got around to looking into this detail for XFS/JFS - they seem to have fewer such options.
Performance: I'll leave it to others to talk about filesystem performance with large files in general.
MythTV takes a lot of writing, and as it turns out, deleting, of large temporary files for the TV features (records, pause, FF/RR). After some reading online, I've found MythTV performance is drastically impacted by filesystem choice due to all of the deleting.
http://www.mythtv.org/docs/mythtv-HOWTO-24.html#ss24.2
http://www.gossamer-threads.com/lists/mythtv/users/52672
---SNIP---
> My last reply to myself. Based on a Googled reference, I was able to
> break my XFS 4G file size barrier by formatting the partition 'mkfs.xfs
> -dagsize=4g'. So, here are the complete results:
>
> Time to delete a 10G file, fastest to slowest:
>
> JFS: 0.9s, 0.9s
> XFS: 1.3s
> EXT3: 1.4s, 2.3s
> EXT2: 1.6s
> REISERFS: 6.2s
> EXT3 -T largefile4: 5.9s, 10.2s
>
> After running the XFS test, there didn't seem to be any point in
> reformatting the partition again, so I left it on XFS, but I think I
> would be happy with JFS, XFS, or EXT3 w/o '-T largefile4'.
>>>>
wepprop at sbcglobal
Feb 8, 2004, 2:33 AM
Post #21 of 22 (4121 views)
Re: Changing filesystems? [In reply to]
Robert Kulagowski wrote:
> Interesting. If others care to weigh in, I can either re-write the
> "Advanced Partitioning" section in the HOWTO, or whack it completely.
>
> William, can you give some background on the hardware used for your
> tests? I'd be curious if this data holds up across various drive types,
> LVM, etc. (Without trying to exhaustively test all the possibilities,
> that is)
It appears, based on my personal experience alone, that file deletes are
the only system operations that can stress the hard drive enough to
produce dropped frames. Unfortunately, as others have pointed out,
recordings and deletions go together in Myth. So, unusual as it may be,
it does make at least some sense to take file deletion performance into
account when deciding which filesystem to use for a video partition,
especially for people with multiple tuners.
The really ironic result from my personal perspective is that it would
appear that using the '-T largefile4' setting for ext3, which I was so
pleased with because it gave me an extra 2G of storage, may well have
been responsible for all those recordings I had ruined by frame drops.
Assuming it works out, though, I could really get to like this XFS
filesystem because it appears to give me slightly more storage space
than ext3 w/ '-T largefile4' did and it has pretty fast deletes as well.
---SNIP---
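A delete-timing test along the lines of the one quoted above could look something like this Python sketch. It is not the script used in that post; the function name time_delete and its parameters are invented for this example. It only times the unlink call itself, so work the filesystem defers until later will not show up in the number.

```python
import os
import tempfile
import time


def time_delete(directory, size_bytes, chunk_size=1024 * 1024):
    """Create a file of `size_bytes` in `directory`, then time its removal.

    Returns the seconds spent in os.remove().
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            chunk = b"\0" * chunk_size
            remaining = size_bytes
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(chunk[:n])
                remaining -= n
            # Push the data to disk so the delete has real blocks to free.
            f.flush()
            os.fsync(f.fileno())
        start = time.perf_counter()
        os.remove(path)
        return time.perf_counter() - start
    except BaseException:
        # Do not leave a large test file behind if something fails.
        if os.path.exists(path):
            os.remove(path)
        raise
```

To compare filesystems, run it against a directory on each one with the same size (the quoted test used a 10G file) and repeat a few times, since a single run can vary a lot.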
Honestly, since I submitted this a couple of months ago, I just formatted the disks to ext3 and it's worked quite well since then, but any better ideas are always welcome.
'Yes, firefox is indeed greater than women. Can women block pops up for you? No. Can Firefox show you naked women? Yes.'
It's a commercial product from Siemens, which I used years ago for Sietec's large-scale imaging product.
There is probably a Linux port: We ran it on almost everything in existence (;-))
--dave
davecb@spamcop.net