Slashdot Mirror


Which OSS Clustered Filesystem Should I Use?

Dishwasha writes "For over a decade I have had arrays of 10-20 disks providing larger than normal storage at home. I have suffered twice through complete loss of data, once due to accidentally not re-enabling the notification on my hardware RAID and having an array power supply fail, after which the RAID controller was unable to recover half of the entire array. Now, I run RAID-10, manually verifying that each mirrored pair is properly distributed across each enclosure. I would like to upgrade the hardware but am currently severely tied to the current RAID hardware and would like to take a more hardware agnostic approach by utilizing a cluster filesystem. I currently have 8TB of data (16TB raw storage) and am very paranoid about data loss. My research has yielded 3 possible solutions: Lustre, GlusterFS, and Ceph." Read on for the rest of Dishwasha's question. "Lustre is well accepted and used in 7 of the top 10 supercomputers in the world, but it has been sullied by the buy-off of Sun to Oracle. Fortunately the creator seems to have Lustre back under control via his company Whamcloud, but I am still reticent to pick something once affiliated with Oracle and it also appears that the solution may be a bit more complex than I need. Right now I would like to reduce my hardware requirements to 2 servers total with an equal number of disks to serve as both filesystem cluster servers and KVM hosts."

"GlusterFS seems to be gaining a lot of momentum now having backing from Red Hat. It is much less complex and supports distributed replication and directly exporting volumes through CIFS, but doesn't quite have the same endorsement as Lustre."

"Ceph seems the smallest of the three projects, but has an interesting striping and replication block-level driver called RADOS."

"I really would like a clustered filesystem with distributed, replicated, and striped capabilities. If possible, I would like to control the number of replications at a file level. The cluster filesystem should work well with hosting virtual machines in a highly available fashion, thereby supporting guest migrations. And lastly it should require as little hardware as possible, with the possibility of upgrading and scaling without taking down data."

"Has anybody here on Slashdot had any experience with one or more of these clustered file systems? Are there any bandwidth and/or latency comparisons between them? Has anyone experienced a failure and can share their experience with the ease of recovery? Does anyone have any recommendations and why?"

14 of 320 comments

  1. Repeat after me: by Anonymous Coward · · Score: 5, Insightful

    RAID is not a backup solution!

    1. Re:Repeat after me: by NFN_NLN · · Score: 5, Insightful

      Parent currently is marked as "0" but is dead on. His opening statement talks about a data loss (x2), is "very paranoid about data loss" and his closing remarks talk about "ease of recovery". Your statements suggest you are primarily concerned about data loss.

      Clustered filesystems are complex software that specialize in concurrent server access, not increased redundancy.

      You need to research backups and/or remote replication. Or buy an enterprise file server that does everything including call-home when it detects a hardware issue.. not waste time on a CFS.

    2. Re:Repeat after me: by NFN_NLN · · Score: 5, Insightful

      Except when they do support redundancy:

      http://www.gluster.com/community/documentation/index.php/Gluster_3.2:_Creating_Replicated_Volumes - Replicated volumes replicate files throughout the bricks in the volume. You can use replicated volumes in environments where high-availability and high-reliability are critical.

      RAID is still NOT A BACKUP!

      I have a 500 node replicated filesystem... and I just overwrote the wrong file, or a virus infected a file, or the file got corrupted...

      The good news is my 500 replicated nodes are all consistent. The bad news is... where's my fucking file!

  2. Re:You Should... by RobDollar · · Score: 2, Insightful

    If ever, this article is the case for your comment. Dishwasha, what the living fuck are you doing with your life. Answer that and then maybe, just maybe, coherent answers will abound.

  3. Obligatory: RAID is not a backup by Anthony+Mouse · · Score: 5, Insightful

    Is the only reason you're looking at a clustered filesystem that you don't want to lose data? Because if it is, it's probably not what you want. The purpose of a clustered filesystem is to minimize downtime in the face of a hardware failure. You still need a backup in the case of a software failure or in case you fat finger something, because a mass deletion can replicate to all copies.
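
    The point about deletions replicating can be shown with a toy model. This is only a sketch: the `ReplicatedStore` class, the file name, and the in-memory dictionaries are made up for illustration and do not correspond to how GlusterFS, Ceph, or Lustre work internally. It assumes synchronous replication in which every operation reaches every replica.

```python
import copy


class ReplicatedStore:
    """Toy model (hypothetical): every write or delete hits all replicas."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, name, data):
        for replica in self.replicas:
            replica[name] = data

    def delete(self, name):
        # The delete is replicated just as faithfully as a write.
        for replica in self.replicas:
            replica.pop(name, None)

    def read(self, name):
        # Any replica will do; they are always identical in this model.
        return self.replicas[0].get(name)


store = ReplicatedStore(replica_count=3)
store.write("thesis.txt", "final draft")

# A backup is a separate point-in-time copy that later changes do not touch.
backup = copy.deepcopy(store.replicas[0])

store.delete("thesis.txt")  # the fat-finger moment

surviving_replicas = sum("thesis.txt" in r for r in store.replicas)
print(surviving_replicas)        # 0
print(backup.get("thesis.txt"))  # final draft
```

    More replicas would not change the outcome here; only the independent copy still has the file.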

  4. Re:ReiserFS by KendyForTheState · · Score: 3, Insightful

    Uh... he DID confess to the crime AND led the cops to his wife's body. I know...sarcasm, right?

    --
    ...I just came for the free beer.
  5. You still need to make a decision by 93+Escort+Wagon · · Score: 4, Insightful

    You ask about the technical specifications; but, when commenting regarding the three likely candidates you found, you've put philosophical objections first and foremost. I think you first need to figure out which factor is more important to you - specs, or philosophy. Otherwise you're probably going to waste a lot of time arguing in circles.

    --
    #DeleteChrome
  6. I was going to say Lustre, but... by Anonymous Coward · · Score: 3, Insightful

    I was going to say Lustre, but then I saw that you only have 16TB. 15 years ago that would have been impressive, but these days, those supercomputers you mention probably have that much in DRAM, and their file storage is in the multi-petabyte range. Lustre is optimized for large scale clusters, in which you have entire nodes (a node is a computer, here) dedicated to I/O - bringing external data into the in-cluster network fabric, while other nodes are compute nodes - they don't talk to the outside world, except by getting data via the I/O nodes.

    That's why you'll see all this talk of OSSs and OSTs, as though they'd be distinct systems - on a large scale cluster they are.

    For only 16TB, what you want is a SAN, or maybe even a NAS.

    If you want open source, then go with openfiler. It supports pretty much everything. I haven't stress tested it, but it seems to work well for that order of magnitude of data.

  7. Bad Dog. Wrong Tree! by SmurfButcher+Bob · · Score: 3, Insightful

    You will spend all this effort to build this solution... and then your house will catch fire.

    On the good side, the fire department WILL manage to save the basement by filling it with 80,000 gallons of water at 2,000GPM per fire engine.

    Or, you'll be wiped out by a flood. Or a drunk will drive through the side of your house. Or you'll have a gas leak and the house will detonate. Or carpenter ants will eat away the floor joists.

    RAID is not a backup solution. Neither is replication... if you whack the data, it'll likely be replicated. If you get a compromised machine somewhere, files they touch will likely be replicated. The only thing you're creating is an overly complex hardware mitigation. If THAT is how you define "data preservation"... you're doing it wrong.

    Look more for a solution to move stuff offsite - a cheap pair of N routers running Tomato or OpenWRT, to a neighbor's house, and you reciprocate with each other. Bonus points if you use versions, transaction logs, journals, etc.
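
    As a rough illustration of the "use versions" idea, here is a minimal sketch of keeping dated copies so that a bad change today cannot overwrite yesterday's good copy. The function name `versioned_backup`, the `keep` parameter, and the `day-01` style labels are hypothetical, and the sketch assumes `backup_root` is a directory on the offsite machine (for example a mounted network share); here it is just a temporary local directory.

```python
import shutil
import tempfile
from pathlib import Path


def versioned_backup(source: Path, backup_root: Path, label: str, keep: int = 7) -> Path:
    """Copy `source` to backup_root/label, then keep only the newest `keep` versions.

    Labels are assumed to sort oldest-to-newest as plain strings.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    target = backup_root / label
    shutil.copytree(source, target)  # fails if this label already exists
    versions = sorted(p for p in backup_root.iterdir() if p.is_dir())
    for old in versions[:-keep]:
        shutil.rmtree(old)
    return target


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "data"
    backups = Path(tmp) / "backups"
    source.mkdir()
    backups.mkdir()

    (source / "notes.txt").write_text("v1")
    versioned_backup(source, backups, "day-01")

    (source / "notes.txt").write_text("oops")  # a bad edit gets backed up too
    versioned_backup(source, backups, "day-02")

    # The older version is still there to restore from.
    restored = (backups / "day-01" / "notes.txt").read_text()
    print(restored)  # v1
```

    Full copies are wasteful for terabytes of data; the sketch is only meant to show why older versions matter, not how to store them efficiently.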

    --
    help me i've cloned myself and can't remember which one I am

  8. Stop with experimental shit by ArchieBunker · · Score: 1, Insightful

    Seriously stop with the experimental and filesystem projects still in beta. You need one that is matured and time tested. Do a bit of research. I don't even run RAID and have yet to permanently lose anything in probably 20 years.

    --
    Only the State obtains its revenue by coercion. - Murray Rothbard
  9. Re:No ZFS? by hjf · · Score: 3, Insightful

    ZFS isn't free anymore. It's all commercial and proprietary and no bugfixes or anything get released outside a big bad support contract with Oracle.

    If you want the free version you can still use v28 on FreeBSD and Solaris Express (no upgrades in over 1 year). Works great, the only thing you don't get is ZFS crypto (transparent encryption).

  10. Re:Thoughts on OCFS by afabbro · · Score: 3, Insightful

    "Thank you, this is one of the few valid answers to my primary question which is of actual experience with clustered file systems. I don't think most of the responders got the clue that I'm looking for a solution that will hopefully scale over a decade's worth of time."

    There is a question of missing clues, but I don't think it's in the responders. You either asked your question poorly or you don't understand your problem. Your question centers on being "paranoid about data loss" and yet you're discussing technologies designed to manage concurrent access to a filesystem. Do you put in gigabit ethernet when you want faster USB performance?

    "I'll likely be upgrading to a Super Micro 2U Twin with QDR Infiniband"

    Give me a break...

    --
    Advice: on VPS providers
  11. Re:You Should... by CheshireDragon · · Score: 1, Insightful

    With as cunty as women are these days, why would anyone want a GF?

    --
    "That's right...I said it."
  12. Re:The Cloud, obviously. by jareth-0205 · · Score: 4, Insightful

    I would be grateful if this bit of 'humour' could not be posted to *every single vaguely cloud-related post*.

    http://linux.slashdot.org/comments.pl?sid=2356014&cid=36928876

    http://tech.slashdot.org/comments.pl?sid=1683582&cid=32542918

    http://tech.slashdot.org/comments.pl?sid=2499970&cid=37882212

    http://it.slashdot.org/comments.pl?sid=2489600&cid=37805882

    Christ. It was only mildly amusing to begin with, let it go.