Slashdot Mirror


Ask Slashdot: Best *nix Distro For a Dynamic File Server?

An anonymous reader (citing "silly workplace security policies") writes "I'm in charge of developing for my workplace a particular sort of 'dynamic' file server for handling scientific data. We have all the hardware in place, but can't figure out what *nix distro would work best. Can the great minds at Slashdot pool their resources and divine an answer? Some background: We have sensor units scattered across a couple square miles of undeveloped land, which each collect ~500 gigs of data per 24h. When these drives come back from the field each day, they'll be plugged into a server featuring a dozen removable drive sleds. We need to present the contents of these drives as one unified tree (shared out via Samba), and the best way to go about that appears to be a unioning file system. There's also a requirement that the server has to boot in 30 seconds or less off a mechanical hard drive. We've been looking around, but are having trouble finding info for this seemingly simple situation. Can we get FreeNAS to do this? Do we try Greyhole? Is there a distro that can run unionfs/aufs/mhddfs out-of-the-box without messing with manual recompiling? Why is documentation for *nix always so bad?"

18 of 234 comments

  1. Wow by Anonymous Coward · · Score: 5, Insightful

    I know I’m not going to be the first person to ask this, but if I understand it the plan here was:

    1 - buy lots of hardware and install
    2 - think about what kind of software it will run and how it will be used

    I think you got your methodology swapped around man!

    Why is documentation for *nix always so bad?

    You are looking for information that your average user won’t care about. Things like boot time don’t get documented because your average user isn’t going to have some arbitrary requirement to have their _file server_ boot in 30 seconds. That’s a very weird use case. Normally you reboot a file server infrequently (unless you want to be swapping disks out constantly..). I’m assuming this requirement is because you plan on doing a full shutdown to insert your drives... in which case you really should be looking into hotswap.

    Also mandatory: you sound horribly underqualified for the job you are doing. Fess up before you waste even more (I assume grant) money and bring in someone that knows what the hell they are doing.

    1. Re:Wow by LodCrappo · · Score: 4, Insightful

      [...]

      Wow.. I completely agree with an AC.

      The OP here is in way over his head and the entire project seems to have been planned by idiots.

      This will end badly.

      --
      -Lod
    2. Re:Wow by mschaffer · · Score: 4, Informative

      [...]

      Wow.. I completely agree with an AC.

      The OP here is in way over his head and the entire project seems to have been planned by idiots.

      This will end badly.

      Like that's the first time. However, we don't know all of the circumstances and I wouldn't be surprised that the OP had this dropped into his/her lap.

    3. Re:Wow by arth1 · · Score: 4, Informative

      Yeah. Before we can answer this person's questions, we need to know why he has:
      1: Decided to cold-plug drives and reboot
      2: Decided to use Linux
      3 ... to serve to Windows

      Better yet, tell us what you need to do - not how you think you should do it. Someone obviously needs to read data that's collected, but all the steps in between should be based on how it can be collected and how it can be accessed by the end users. Tell us those parameters first, and don't throw around words like Linux, samba, booting, which may or may not be a solution. Don't jump the gun.

      As for documentation, no other OSes are as well-documented as Linux/Unix/BSD.
      Not only are there huge amounts of man pages, but there are so many web sites and books that it's easy to find answers.

      Unless, of course, you have questions like how fast a distro will boot, and don't have enough understanding to see that that depends on your choice of hardware, firmware and software.
      I have a nice Red Hat Enterprise Linux system here. It takes around 15 minutes to boot. And I have another Red Hat Enterprise Linux system here. It boots in less than a minute. The first one is -- by far -- the better system, but enumerating a plaided RAID of 18 drives takes time. That's also irrelevant, because it has an expected shutdown/startup frequency of once per two years.

    4. Re:Wow by Anonymous Coward · · Score: 4, Informative

      Op here:

      The gear was sourced from a similar prior project that's no longer needed, and we don't have the budget/authorization to buy more stuff. Considering that the requirements are pretty basic, we weren't expecting to have a serious issue picking the right distro.

      >You are looking for information that your average user won’t care about.

      Granted, but I thought one of the strengths of *nix was that it's not confined to computer illiterates. Some geeks somewhere should know which distros can be stripped down to bare essentials with a minimum of fuss.

      As for the 30 seconds thing, there's a lot of side info I left out of the summary. This project is quirky for a number of reasons, one of them being that the server itself spends a lot of time off and needs to be booted (and halted) on demand. (Don't ask, it's a looooooong story).

    5. Re:Wow by Alex Belits · · Score: 4, Funny

      I worked with networked computers in professional capacity longer than all of you combined, and I completely agree with the person you are replying to.

      You are absolutely definitely unqualified to make any design decisions about the project you have described. The design is stupid, requirements are idiotic, and if it was implemented in such manner it would not work for many reasons that you don't seem to be capable of understanding.

      On top of that massive ignorance, you are stupid.

      --
      Contrary to the popular belief, there indeed is no God.
  2. I would automate the copying by guruevi · · Score: 4, Informative

    Really, singular hard drives are notoriously bad at keeping data around for long. I would make sure you have a copy of everything. So make a file server with RAIDZ2 or RAID6 and script the copying of these hard drives onto a system that has redundancy and is backed up as well.

    How many times have I seen scientists come out with their 500GB portable hard drives only to find them unreadable? Way too many. If you fill 500GB in 24 hours, there is no way a portable hard drive will survive for longer than about a year. Most of our drives (500GB 2.5" portable drives) last a few months; once they have processed about 6TB of data full-time they are pretty much guaranteed to fail.
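
    The scripted copy suggested above could look roughly like the following. This is only a sketch under stated assumptions: the function names and both directory roots are hypothetical, the destination is assumed to be an already-mounted redundant array, and comparing SHA-256 digests after the copy is one possible integrity check, not something the comment prescribes.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large sensor files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source_root: Path, dest_root: Path) -> list[Path]:
    """Copy every file under source_root into dest_root.

    Returns the source files whose copy did not hash identically.
    Both roots are hypothetical: a field drive and a redundant array.
    """
    failed = []
    for source in sorted(p for p in source_root.rglob("*") if p.is_file()):
        target = dest_root / source.relative_to(source_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also keeps timestamps
        if sha256_of(source) != sha256_of(target):
            failed.append(source)
    return failed
```

    An empty return list means every copy matched its source. Reading the target back right after writing may be served from the page cache, so a stricter check would re-verify later.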

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  3. Re:Do you need a unified filesystem at all? by Anrego · · Score: 4, Insightful

    I have to assume they are using some clunky windows analysis program or something that lacks the ability to accept multiple directories or something.

    Either way, the aufs (or whatever they use) bit seems to be the least of their worries. They bought and installed a bunch of gear and are just now looking into what to do with it, and they've decided they want it to boot in 30 seconds (protip: high end gear can take this long just doing its self checks, which is a good thing! Fast booting and file server don't go well together).

    Probably a summer student or the office "tech guy" running things. They'd be better off bringing in someone qualified.

  4. What Greyhole isn't by NemoinSpace · · Score: 4, Insightful
    • Enterprise-ready: Greyhole targets home users.

    Not sure why the 30s boot up requirement is there, so it depends on what you define as "booted". Spinning up 12 hard drives and making them available through Samba within 30s guarantees your costs will be 10x more than they need to be.
    This isn't another example of my tax dollars at work is it?

  5. Re:CentOS, its enterprise class by wytcld · · Score: 4, Insightful

    "Enterprise class" is a marketing slogan. In the real world, all the RH derivatives are pretty good (including Scientific Linux and Fedora as well as CentOS), and all the Debian derivatives are pretty good (including Ubuntu). Gentoo's solid too. "Enterprise class" doesn't mean much. The main thing that distinguishes CentOS from Scientific Linux - which is also just a recompile of the RHEL code - is that the CentOS devs have "enterprise class" attitude. Meanwhile, RH's own devs are universally decent, humble people. Those who do less often think more of themselves.

    For a great many uses, Debian's going to be easiest. But it depends on just what you need to run on it, as different distros do better with different packages, short of compiling from source yourself. No idea what the best solution is for the task here, but "CentOS" isn't by itself much of an answer.

    --
    "with their freedom lost all virtue lose" - Milton
  6. Re:Do you need a unified filesystem at all? by Anonymous Coward · · Score: 4, Informative

    OP here:

    I left out a lot of information from the summary in order to keep the word count down. Each disk has an almost identical directory structure, and so we want to merge all the drives in such a way that when someone looks at "foo/bar/baz/" they see all the 'baz' files from all the disks in the same place. While the folders will have identical names the files will be globally unique, so there's no concern about namespace collisions at the bottom levels.
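
    The merge described above, with identical folder names and globally unique file names, can be sketched as a plain index over the mounted drives. This is an illustration, not what a union filesystem does internally: `merged_index` and the drive mount points are hypothetical names, and the collision check exists only to confirm the "globally unique" assumption instead of trusting it.

```python
from collections import defaultdict
from pathlib import Path


def merged_index(drive_roots: list[Path]) -> dict[str, Path]:
    """Map each relative file path to the single drive file that provides it.

    drive_roots are hypothetical mount points, one per sensor drive.
    Raises ValueError if two drives carry the same relative path.
    """
    seen = defaultdict(list)
    for root in drive_roots:
        for path in root.rglob("*"):
            if path.is_file():
                seen[path.relative_to(root).as_posix()].append(path)
    collisions = {rel: paths for rel, paths in seen.items() if len(paths) > 1}
    if collisions:
        raise ValueError(f"same relative path on several drives: {sorted(collisions)}")
    return {rel: paths[0] for rel, paths in seen.items()}
```

    Listing the keys that start with "foo/bar/baz/" would then show all the 'baz' files from all the disks in one place.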

  7. ZFS Filesystem will help by Anonymous Coward · · Score: 4, Insightful

    500G in a 24h period sounds like it will be highly compressible data. I would recommend FreeBSD or Ubuntu with ZFS Native Stable installed. ZFS will allow you to create a very nice tree with each folder set to a custom compression level if necessary. (Don't use dedup) You can put one SSD in as a cache drive to accelerate the shared folders speed. I imagine there would be an issue with restoring the data to magnetic while people are trying to read off the SMB share. An SSD cache or SSD ZIL drive for ZFS can help a lot with that.

    Some nagging questions though.
    How long are you intending on storing this data? How many sensors are collecting data? Even with 12 drive bay slots and cheap 3TB SATA drives (36TB total storage with no redundancy), let's say 5 sensors: that's 2.5TB a day of data collection, or 833GB a day assuming good compression of 3x. You will fill up that storage in just 43 days.
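
    The estimate above can be reproduced with a few lines of arithmetic. The figures are the comment's own guesses (36TB raw capacity, 5 sensors, 500GB per sensor per day, 3x compression), decimal units are assumed (1TB = 1000GB), and the function name is hypothetical.

```python
def days_until_full(capacity_tb: float, sensors: int,
                    gb_per_sensor_per_day: float,
                    compression_ratio: float) -> float:
    """Days until the array fills, assuming a steady daily intake."""
    stored_gb_per_day = sensors * gb_per_sensor_per_day / compression_ratio
    return capacity_tb * 1000 / stored_gb_per_day


# The comment's scenario: 5 sensors at 500GB/day, compressed 3x, into 36TB.
print(round(days_until_full(36, 5, 500, 3), 1))  # 43.2
```

    With a full set of 12 drives arriving each day instead of 5, the same formula gives 18 days, which is why the retention period matters so much here.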

    I think this project needs to be re-thought. Either you need a much bigger storage array, or data needs to be discarded very quickly. If the data will be discarded quickly, then you really need to think about more disk arrays so you can use ZFS to partition the data in such a way that each SMB share can be on its own set of drives so as to not head thrash and interfere with someone else who is "discarding" or reading data.

  8. Re:Mechanical Hard Drive by davester666 · · Score: 4, Funny

    They already bought a $20 5400rpm 80Gb drive and don't want it to be wasted.

    --
    Sleep your way to a whiter smile...date a dentist!
  9. waaaay over head by itzdandy · · Score: 4, Insightful

    What is the point of 30 second boot on a file server? If this is on the list of 'requirements', then the 'plan' is 1/4 baked. 1/2 baked for buying hardware without a plan, then 1/2 again for not having a clue.

    unioning filesystem? what is the use scenario? how about automounting the drives on hot-plug and sharing the /mnt directory?

    Now, 500GB/day in 12 drive sleds....so 6TB a day? do the workers get a fresh drive each day or is the data only available for a few hours before it gets sent back out or are they rotated? I suspect that mounting these drives for sharing really isn't what is necessary, more like pull contents to 'local' storage. Then, why talk about unioning at all, just put the contents of each drive in a separate folder.

    Is the data 100% new each day? Are you really storing 6TB a day from a sensor network? 120TB+ a month?

    Are you really transporting 500GB of data by hand to local storage and expecting the disks to last? reading or writing 500GB isn't a problem, but constant power cycling and then physically moving/shaking the drives around each day to transport is going to put the MTBF of these drives in months not years.

    dumb

  10. Re:Do you need a unified filesystem at all? by Anonymous Coward · · Score: 4, Informative

    FreeNAS is based on FreeBSD, and boot speed (no matter what the OS) is based entirely on the hard drive speed + CPU speed + 'automagic' configuration.

    FreeBSD boots pretty fast, but you need to turn off things like the bootloader menu delay, and set fixed IP addresses. Same on Linux, but Linux tends to be sloppy about starting up services.

    In either case you can usually just turn anything you don't need off, and just turn on what you do need.

    FreeBSD's ZFS is better than anything you can setup on Linux, but unless the box has a lot of RAM you're not going to get the expected performance.

    Most of the NAS devices you see for sale run FreeNAS if they're based on x86-64 CPUs or Linux if they're not (PPC/MIPS/ARM), but they're not particularly great pieces of hardware. You pretty much end up with something silly like:
    OS -> UFS/EXT2/EXT3 -> Samba share
    for Windows clients, but you can also do this on FreeBSD/FreeNAS (ZFS is terrible under Linux-FUSE)
    FREEBSD->ZFS (using all drives, even remote drives) -> iSCSI
    iSCSI is something that you must have GigE/10GB Fiber for, and decent processing power. Most of the systems you see (including DELL) that do iSCSI are woefully underpowered for a small server, or extremely overkill (enterprise)

    Windows however supports iSCSI out of the box. So you can do something theoretically stupid like this:
    FreeBSD -> ZFS ->iSCSI ->Windows box accesses iSCSI and shares it with other Windows machines.

    So it depends what you really want to do. From your description, it sounds like what you really want to do is hotplug a bunch of drives into a system, have that system "union" them by filesystem mounts (nobody says you have to mount everything to root) and then share them under Samba.

    But another possibility, not clearly indicated is that maybe the drives have overlapping file systems that you want to see as one (eg same directory structure, different file names) this is more complicated to deal with, but I'd probably go with not trying to share off the hotswapped drives and instead RSYNC all the drives to another filesystem and share that instead.
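
    For the overlapping-tree case, one alternative to both a union filesystem and a full rsync copy is a symlink farm: recreate the directory structure once and link each file back to the drive that holds it. This is a different technique from the one the comment recommends, shown only as a sketch; `build_symlink_farm` and its arguments are hypothetical names.

```python
import os
from pathlib import Path


def build_symlink_farm(drive_roots: list[Path], farm_root: Path) -> int:
    """Recreate the shared tree under farm_root, linking each file to its drive.

    drive_roots are hypothetical mount points; farm_root is the directory
    that would be shared. Returns the number of links created.
    """
    linked = 0
    for root in drive_roots:
        for source in root.rglob("*"):
            if not source.is_file():
                continue
            link = farm_root / source.relative_to(root)
            link.parent.mkdir(parents=True, exist_ok=True)
            # os.symlink raises FileExistsError if two drives carry the same
            # name, so a collision is reported rather than silently hidden.
            os.symlink(source.resolve(), link)
            linked += 1
    return linked
```

    The links go stale as soon as a drive is pulled, so the farm would have to be rebuilt on every swap. Samba would also need to be allowed to follow links that point outside the share (its `wide links` setting), which has security implications.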

  11. OP here by Anonymous Coward · · Score: 5, Informative

    Ok, lots of folks asking similar questions. In order to keep the submission word count down I left out a lot of info. I *thought* most of it would be obvious, but I guess not.

    Notes, in no particular order:

    - The server was sourced from a now-defunct project with similar setup. It's a custom box with non-normal design. We don't have authorization to buy more hardware. That's not a big deal because what we have already *should* be perfectly fine.

    - People keep harping on the 30 seconds thing.
    The system is already configured to spin up all the drives simultaneously (yes the PSU can handle that) and get through the bios all in a few seconds. I *know* you can configure most any distro to be fast, the question is how much fuss it takes to get it that way. Honestly I threw that in there as an aside, not thinking this would blow up into some huge debate. All I'm looking for are pointers along the lines of "yeah distro FOO is bloated by default, but it's not as bad as it looks because you can just use the BAR utility to turn most of that off". We have a handful of systems running winXP and linux already that boot in under 30, this isn't a big deal.

    - The drives in question have a nearly identical directory structure but with globally-unique file names. We want to merge the trees because it's easier for people to deal with than dozens of identical trees. There are plenty of packages that can do this, I'm looking for a distro where I can set it up with minimal fuss (ie: apt-get or equivalent, as opposed to manual code editing and recompiling).

    - The share doesn't have to be samba, it just needs to be easily accessible from windows/macs without installing extra software on them.

    - No, I'm not an idiot or derpy student. I'm a sysadmin with 20 years experience (I'm aware that doesn't necessarily prove anything). I'm leaving out a lot of detail because most of it is stupid office bureaucracy and politics I can't do anything about. I'm not one of those people who intentionally makes things more complicated than they need to be as some form of job security. I believe in doing things the "right" way so those who come after me have a chance at keeping the system running. I'm trying to stick to standards when possible, as opposed to creating a monster involving homegrown shell scripts.

  12. Not gonna happen. by Anonymous Coward · · Score: 5, Insightful

    You have to be able to identify the disks being mounted. Since these are hot swappable, they will not be automatically identifiable.

    Also note, not all disks spin up at the same speed. Disks made for desktops are not reliable either - though they tend to spin up faster. Server disks might take 5 seconds before they are failed. You also seem to have forgotten that even with all disks spun up, each must be read (one at a time) for them to be mounted.

    Hot swap disks are not something automatically mounted unless they are known ahead of time - which means they have to have suitable identification.
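
    On a Linux system with udev, one common form of "suitable identification" is the filesystem UUID, which shows up as a symlink under /dev/disk/by-uuid. The sketch below assumes that layout; the function names and the table of known UUIDs and mount points are hypothetical, and nothing is actually mounted.

```python
import os
from pathlib import Path


def attached_filesystems(by_uuid_dir: Path = Path("/dev/disk/by-uuid")) -> dict[str, str]:
    """Map filesystem UUID -> device node by reading the symlinks udev maintains."""
    if not by_uuid_dir.is_dir():
        return {}
    return {entry.name: os.path.realpath(entry)
            for entry in sorted(by_uuid_dir.iterdir())}


def mount_plan(attached: dict[str, str], known: dict[str, str]) -> dict[str, str]:
    """Pick a mount point for every attached device whose UUID is known.

    known is a hypothetical table of UUID -> mount point, filled in
    once per sensor drive when the drive is first formatted.
    """
    return {device: known[uuid]
            for uuid, device in attached.items() if uuid in known}
```

    Drives whose UUID is missing from the table are simply left out of the plan, which matches the point above: a disk that was not registered ahead of time does not get mounted automatically.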

    UnionFS is not what you want. That isn't what it was designed for. Unionfs only has one drive that can be written to - the top one in the list. Operations on the other disks force it to copy it to the top disk for any modifications. Deletes don't happen to any but the top disk.

    Some of what you describe is called an HSM (hierarchical storage management), and requires a multi-level archive where some volumes may be on line, others off line, yet others in between. Boots are NOT fast, mostly due to the need to validate the archive first.

    Back to the unreliability of things - if even one disk has a problem, your union filesystem will freeze - and not nicely either. The first access to a file that is inaccessible will cause a lock on the directory. That lock will lock all users out of that directory (they go into an infinite wait). Eventually, the locks accumulate to include the parent directory... which then locks all leaf directories under it. This propagates to the top level when the entire system freezes - along with all the clients. This freezing nature is one of the things that an HSM handles MUCH better. A detected media error causes the access to abort, and that releases the associated locks. If the union filesystem detects the error, then the entire filesystem goes down the tubes, not just one file on one disk.

    Another problem is going to be processing the data - I/O rates are not good going through a union filesystem yet. Even though UnionFS is pretty good at it, expect the I/O rate to be 10% to 20% less than maximum. Now client I/O has to go through a network connection, so that may make it bearable. But trying to process multiple 300 GB data sets in one day is not likely to happen.

    Another issue you have ignored is the original format of the data. You imply that the filesystem on the server will just "mount the disk" and use the filesystem as created/used by the sensor. This is not likely to happen - trying to do so invites multiple failures; it also means no users of the filesystem while it is getting mounted. You would do better to have a server disk farm that you copy the data to before processing. That way you get to handle the failures without affecting anyone that may be processing data, AND you don't have to stop everyone working just to reboot. You will also find that local copy rates will be more than double what the server's client systems can read anyway.

    As others have mentioned, using gluster file system to accumulate the data allows multiple systems to contribute to the global, uniform, filesystem - but it does not allow for plugging in/out disks with predefined formats. It has a very high data throughput though (due to the distributed nature of the filesystem), and would allow many systems to be copying data into the filesystem without interference.

    As for experience - I've managed filesystems with up to about 400TB in the past. Errors are NOT fun as they can take several days to recover from.

  13. Ask the correct community : science informatics by oneiros27 · · Score: 4, Informative

    What you're describing sounds like a fairly typical Sensor Net (or Sensor Web) to me, maybe with a little more data logged than is normal per platform. (I believe they call it a 'mote' in that community).

    Some of the newer sensor nets use a forwarding mesh wireless system, so that you relay the data to a highly reduced number of collection points -- which might keep you from having to deal with the collection of the hard drives each night (maybe swap out a multi-TB RAID at each collection point each night instead).

    I'm not 100% sure of what the correct forum is for discussion of sensor/platform design. I know they have presentations in the ESSI (Earth and Space Science Informatics) focus group of the AGU (American Geophysical Union). Many of the members of ESIPfed (Federation of Earth Science Information Partners) probably have experience in these issues, but it's more about discussing managing the data after it comes out of the field.

    On the off chance that someone's already written software to do 90% of what you're looking for, I'd try contacting the folks from the Software Reuse Working Group of the Earth Science Data System community.

    You might also try looking through past projects funded through NASA AISR (Advanced Information Systems Research) ... they funded better sensor design & data distribution systems. (unfortunately, they haven't been funded for a few years ... and I'm having problems accessing their website right now). Or I might be confusing it with the similar AIST (Advanced Information Systems Technology), which tends more towards hardware vs. software. ... so, my point is -- don't roll your own. Talk to other people who have done similar stuff, and build on their work, otherwise you're liable to make all of the same mistakes, and waste a whole lot of time. And in general (at least ESSI / ESIP-wide), we're a pretty sharing community ... we don't want anyone out there wasting their time doing the stupid little piddly stuff when they could actually be collecting data or doing science.

    (and if you haven't guessed already ... I'm an AGU/ESSI member, and I think I'm an honorary ESIP member (as I'm in the space sciences, not earth science) ... at least they put up with me on their mailing lists)

    --
    Build it, and they will come^Hplain.