Slashdot Mirror


Which OSS Clustered Filesystem Should I Use?

Dishwasha writes "For over a decade I have had arrays of 10-20 disks providing larger than normal storage at home. I have suffered twice through complete loss of data once due to accidentally not re-enabling the notification on my hardware RAID and having an array power supply fail and the RAID controller was unable to recover half of the entire array. Now, I run RAID-10 manually verifying that each mirrored pair is properly distributed across each enclosure. I would like to upgrade the hardware but am currently severely tied to the current RAID hardware and would like to take a more hardware agnostic approach by utilizing a cluster filesystem. I currently have 8TB of data (16TB raw storage) and am very paranoid about data loss. My research has yielded 3 possible solutions: Luster, GlusterFS, and Ceph." Read on for the rest of Dishwasha's question. "Lustre is well accepted and used in 7 of the top 10 supercomputers in the world, but it has been sullied by the buy-off of Sun to Oracle. Fortunately the creator seems to have Lustre back under control via his company Whamcloud, but I am still reticent to pick something once affiliated with Oracle and it also appears that the solution may be a bit more complex than I need. Right now I would like to reduce my hardware requirements to 2 servers total with an equal number of disks to serve as both filesystem cluster servers and KVM hosts."

"GlusterFS seems to be gaining a lot of momentum now having backing from Red Hat. It is much less complex and supports distributed replication and directly exporting volumes through CIFS, but doesn't quite have the same endorsement as Lustre."

"Ceph seems the smallest of the three projects, but has an interesting striping and replication block-level driver called Rados."

"I really would like a clustered filesystem with distributed, replicated, and striped capabilities. If possible, I would like to control the number of replications at a file level. The cluster filesystem should work well with hosting virtual machines in a high-available fashion thereby supporting guest migrations. And lastly it should require as minimal hardware as possible with the possibility of upgrading and scaling without taking down data."

"Has anybody here on Slashdot had any experience with one or more of these clustered file systems? Are there any bandwidth and/or latency comparisons between them? Has anyone experienced a failure and can share their experience with the ease of recovery? Does anyone have any recommendations and why?"

5 of 320 comments (clear)

  1. Repeat after me: by Anonymous Coward · · Score: 5, Insightful

    RAID is not a backup solution!

    1. Re:Repeat after me: by NFN_NLN · · Score: 5, Insightful

      Parent currently is marked as "0" but is dead on. His opening statement talks about a data loss (x2), is "very paranoid about data loss" and his closing remarks talk about "ease of recovery". Your statements suggest you are primarily concerned about data loss.

      Clustered filesystems are complex software that specialize in concurrent server access, not increased redundancy.

      You need to research backups and/or remote replication. Or buy an enterprise file server that does everything including call-home when it detects a hardware issue.. not waste time on a CFS.

    2. Re:Repeat after me: by NFN_NLN · · Score: 5, Insightful

      Except when they do support redundancy:

      http://www.gluster.com/community/documentation/index.php/Gluster_3.2:_Creating_Replicated_Volumes - Replicated volumes replicate files throughout the bricks in the volume. You can use replicated volumes in environments where high-availability and high-reliability are critical.

      RAID is still NOT A BACKUP!

      I have a 500 node replicated filesystem... and I just overwrote the wrong file, or a virus infected a file, or the file got corrupted...

      The good news is my 500 replicated nodes are all consistent. The bad news is... wheres my fucking file!

  2. Obligatory: RAID is not a backup by Anthony+Mouse · · Score: 5, Insightful

    Is the only reason you're looking at a clustered filesystem that you don't want to lose data? Because if it is, it's probably not what you want. The purpose of a clustered filesystem is to minimize downtime in the face of a hardware failure. You still need a backup in the case of a software failure or in case you fat finger something, because a mass deletion can replicate to all copies.

  3. I know this isn't what you asked but... by KendyForTheState · · Score: 5, Interesting

    20 disks seems like overkill for your storage needs. Seems like the more disks you use the greater the risk of failure of one or more of them. Also, your electricity bill must be through the roof. I have 4 3TB drives with a 3Ware controller in RAID5 array which gives me the same storage capacity with 1/5th the drives Aren't you making this more complicated than it needs to be? ...Maybe that's the point?

    --
    ...I just came for the free beer.