Best Format For OS X and Linux HDD?
dogmatixpsych writes "I work in a neuroimaging laboratory. We mainly use OS X but we have computers running Linux and we have colleagues using Linux. Some of the work we do with Magnetic Resonance Images produces files that are upwards of 80GB. Due to HIPAA constraints, IT differences between departments, and the size of files we create, storage on local and portable media is the best option for transporting images between laboratories. What disk file system do Slashdot readers recommend for our external HDDs so that we can readily read and write to them using OS X and Linux? My default is to use HFS+ without journaling but I'm looking to see if there are better suggestions that are reliable, fast, and allow read/write access in OS X and Linux."
UFS would be the best option. Linux supports it with -rw since Kernel 2.6.30 (afaik) and OS X mounts UFS natively.
I have a similar problem, albeit on a smaller scale. I use unjournalled HFS+.
However, the problem is that HFS+ being a proper unix filesystem remembers UIDs and GIDs which are usually inappropriate when the disk is moved.
Is there any good way to get Linux to mount the filesystem and give every file the same UID and GID, like for non unix filesystems?
SJW n. One who posts facts.
By "HIPAA Constraints" I assume you mean the privacy rule. I would think that this rule would prevent you from using sneakernet to transmit files. Unless you're encrypting your portable disks, and somehow it doesn't sound like you are.
Fun reading:
http://www.computerworld.com/s/article/9141172/Health_Net_says_1.5M_medical_records_lost_in_data_breach
More like "I don't know if it's broke. I could be doing something so wrong it could end up on The Daily WTF. If what I'm doing is broke, could you help me fix it?"
and 4Gb cap...
Things in a rear mirror might be behind you
How the fuck is he supposed to store 80 GB files on a filesystem that maxes out at 4 GB?
NTFS or any other FUSE (MacFUSE) file system. However in a heterogeneous environment NTFS has the bonus of native Windows support.
There is NTFS-3G for Linux and Mac OS X
There is also an EXT2 Fuse FS (for Mac OS), and probably many other options.
Having said that, I have never had a problem with Linux's HFS+ write support.
I would have recommended ReiserFS, but the data might get buried somewhere and the system would not remember where it was....
Take Nobody's Word For It.
If you are only moving files from one system to another, and do not need to edit them on the portable drives, skip the filesystem and just use tar. Tar will happily write to and read from raw block devices... In fact, that is exactly what it was designed to do. A side benefit of this approach is that you won't lose any drive capacity to filesystem overhead.
Ask Slashdot: Where bad ideas meet poor googling skills.
You're storing it in the wrong format - there are all sorts of tools to convert to Analyse or DICOM format, which give you a managable frame-by-frame set of images rather than one huge one. Most tools to manipulate MRI data expect DICOM or Analyse anyhow (BrainVoyager, NISTools, etc).
If you really want to keep it all safe, use tarfiles to hold structured data, although if you do that you've made it big again.
Removable media are a daft long-term storage - use ad-hoc removable media solutions (or more ideally, scp) to move the data.
For every problem, there is at least one solution that is simple, neat, and wrong.
Windows doesn't play in here, it's OSX and Linux. Tossing NTFS into that would just be... wrong somehow.
For large sets, this will be our guide even unto death, for the LORD will work for each type of data it is applied to...
Hah! In my company we call it "NoTeFíeS" (for non Spanish-speaking people: "Don'tTrustIt").
Strength, balance, courage and reason. If you know what's this about, contact me!
Just tunnel NFS over SSH. I can't imagine how secure it would be to sneakernet any files around the office. If you need to encrypt the data at rest then either encrypt on the client or leverage an encrypted filesystem of a Decru type appliance.
OS X UFS has a very unfortunate limit as it doesn't support files over 4 GB. Or, there was no chance, I would format everything (especially USB) as UFS.
Lack of commercial quality disk tools like Disk Warrior if a true catastrophe happens is a problem too. Of course, fsck can do good things but after a true catastrophic filesystem issue, diskwarrior is a must. That was one of the things Professional Mac community had hard time explaining ZFS community.
As Apple was truly wise to completely document it down to a point you can even write a full feature defragmenter (iDefrag), HFS+ without journaling seems to be the best option. I am in video business and I have seen it deal with files way beyond 80GB without any issues. In fact, lots of OS X users who images their drives see it everyday too.
I don't know why journaling is not implemented, it is open and documented too. If a bit hassle happens, it sure deserves it since he deals with external drives which are just fit to journaling purposes.
Really, you need a gigabit network and transfer files over it using AFP and/or NFS and/or SMB. First of all HIPAA requires you to encrypt your hard drives which most researchers won't do (it's too difficult). Then you also got the problem what happens if the researchers (or somebody else) leaves with the data.
Solaris and by extension Nexenta have really good solutions for this. You can DIY a 40TB RAIDZ2 system for well under $18,000. If you use desktop SATA drives (which I wouldn't recommend but ZFS keeps it safe) for your data you can press that cost to $10 or $12k.
I work in the same environment as you (neuroimaging, large datasets), feel free to contact me privately for more info.
Custom electronics and digital signage for your business: www.evcircuits.com
If it's Mac OS X 10.6.x, you don't even n eed NTFS-3G, as the native NTFS driver has read / write capability. You just need to change the /etc/fstab entry for the volume to rw, and remount.
Slashdot still doesnâ(TM)t support Unicode after it was added to the HTML standard in 1997.
Roses are red,
Violets are blue,
All of my mod points
Would belong to you.
I'm using a USB Disk formatted under linux with UDF (yep, it's not limited to DVDs, there is a profile for hard disks). It can be used without problems under OSX (even Snow Leopard)
Marquise
-- I need a new sig.
who will wooosh the woooshers?
A simple NAS enclosure or NAS device might be what you are looking for. You can get a single drive NAS enclosure, and add a drive, that you can carry around just like a regular portable drive. You can move it between networks and use any connection method the NAS device happens to implement (SMB, FTP, NFS, etc). Some even let you optionally connect it directly via USB or eSATA to access the file system directly, and some may have encryption or other security features as well.
Of course, check to make sure you have permission and that connecting things to your network does not violate any policies. If connecting a network device directly to the your network is not permitted then perhaps you can add a second, dedicated, network card to the computers.
Ok everybody's occupied with surreal suggestions, but anyway:
*UDF* is quite awesome as a on disk format for LinuxOSX data exchange, because it has a file size limit around 128TB, supports all the posix permissions, hard and soft links and whatnots. There is a nice whitepaper summing it all up:
http://www.13thmonkey.org/documentation/UDF/UDF_whitepaper.pdf
If you want to use UDF on a hard disk, prepare it under linux: /dev/sdb (that's right, UDF takes the whole disk, now partitions)
1) Install uddftools
2) wipe the first few blocks of the hard disk, i.e. dd if=/dev/zero of=/dev/sdb bs=1k count=100
3) create the file system : mkudffs --media-type=hd --utf8
If you plug this into OSX, the drive will show up as "LinuxUDF". I am using this setup for years to move data between linux and OSX machines.
Marquise
-- I need a new sig.
This is dangerous advice. There are numerous reports of instability and NTFS volume corruption when forcing 10.6 to mount NTFS volumes R/W. Apple seems to have turned NTFS write off by default for a good reason, it's not done yet.
Windows doesn't play in here, it's OSX and Linux. Tossing NTFS into that would just be... wrong somehow.
Flamebait mod or not, there is a valid point. Though various NTFS drivers do allow read/write, the success isn't graven in stone. There are better alternatives in the Linux/OSX world. Keep in mind that losing this data becomes either costly (as in time=money, let's go make another set of copies to run to whatever office) or very bad (as in someone moved the files to the external instead of copying them) or both.
So, as good as the NTFS R/W drivers are getting, it's safer to use a file system that is known to be more stable and less error prone, such as HFS+ or UFS or one of the other suggestions. "Really good" shouldnt be an option in the medical world when "even better than 'really good'" is available, compatible, and easy to install on all systems involved.
StarTrekPhase2 - The Five Year Mission Continues!
That is an excellent solution, and arguably the best to the OP's problem printed. UDF works on Windows, OS X, Linux. Even AIX is happy with it and can write to it. So an external drive with this on it should definitely solve the problem.
We had almost exactly the same problem. Our fMRI work was done at University of Virginia on a Linux machine. naturally you don't want to tie up a $1500/hour data collection machine doing analysis. Our data was transferred immediately to the Neurological Institute to a multiboot machine. No patient data included at this point, so no HIPPA problems. The receiving box ran Linux initially since the analysis programs from NIH (primarily AFNI) were Linux based. Patient data got added here so HIPPA became an issue. The machine had multiple hard drive bays, all of which were removable, plug-and-play drives made from a kit that provided slide-in rails and a locking mechanism, otherwise were common, commercial drives. Externals would have been easier, but the guy who devised this had a rilly rilly good reason. I remember it was good, but not what it was. Anyway, the machine could boot other OSs, prep the drives, go back to the native Linux HFS+ and transfer/translate to the , it was transferred, the drive removed, packaged, and FedEx'd to the other analysis sites at Virginia Tech, NIH, and U.Va Wise. We were strictly experimental, no direct medical treatment, and so time was not an issue. With OS X being *nix, there's not a lot of reasons to go with one over the other except for convenience when it comes to what your data collection and analysis are running under. Unless yours run fine under OS X, I'd say stick with HFS+, and of course moderate that according to whether you have to share out the data and what those people are running. I wouldn't bother with supporting Windows, as they continually find new problems to have with large files. One comparison test showed no difference in analysis results, but they did have problems with Windows choking on the data files. Their test files were only 1.5 GB. ref: J Med Dent Sci. 2004 Sep;51(3):147-54. Comparison of fMRI data analysis by SPM99 on different operating systems. PMID: 15597820. My experience agreed with their results. As I said we had little call for Macs, so we didn't run enough of that to give a good test of whether it had the same kind of problems. Bottom line, we used what we needed to according to where it was going and what they needed it to be, but for our own use it made no sense to transfer it out of the OS that collection and analysis used, HFS. The system met with the approval of the biophysicist we worked with at U.Va, and he had been a grad student under Peter Fox when the latter developed SPM. OH YEAH: the good reason. If anyone else wanted to work with us, they didn't have to dig too deeply into techie stuff either hardware or software. We could send them a removable-drive kit to install, and send them a drive with bootable Linux, AFNI and data, all plug and play. If that might be useful to you (using externals instead of removables doesn't matter here) that's probably be another vote for HFS.
"I may be synthetic, but I'm not stupid." -- Bishop 341-B