Slashdot Mirror


Preventing Shutdown on Active NFS Servers?

Ed Almos asks: "Like many Slashdot Readers, I run a small network at home with a server and a number of desktops. The server holds all our files as NFS shares and doubles as a desktop machine should the need arise. Problems, however, occur if the server is shut down whilst there are NFS shares in use: the minimum disruption is a crashed desktop, and a couple of times I have had to deal with corrupted files. Does anyone know of a way to prevent shutdown of a machine if someone else has drives mounted to its NFS shares? I have already explored use of the /etc/shutdown.allow file, but all this does is determine who can kill the machine. The minimal solution would be something similar to a Microsoft Windows system, where a request to shut down brings up a warning window that there are users connected to the system, but I am not sure how to achieve this on a Linux system. Ideally I would like to prevent shutdown of a system with active NFS shares altogether, or at least until the user has unmounted and logged off the network."

11 of 66 comments

  1. Can't do it by djmitche · · Score: 4, Insightful

    NFS is stateless from the server's perspective. This is done so that the server doesn't have to track the state of a whole fleet of clients (and so that the server can pick up where it left off when it crashes and restarts).

    So the server, by design, has no notion of the number / names of users connected to it.

    The best you could do would probably be to monitor NFS traffic, and present a dialog on shutdown if there has been any traffic in the last 5 minutes or so.
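
    A rough sketch of that traffic check, assuming a Linux server whose kernel NFS daemon exposes its cumulative RPC counters in /proc/net/rpc/nfsd on a line of the form "rpc <calls> ...". The function names and the optional file argument are made up for illustration:

```shell
#!/bin/sh
# Assumption: /proc/net/rpc/nfsd has a line "rpc <calls> <badcalls> ..."
# whose second field is the total number of RPC calls served so far.

# Print the cumulative RPC call count.
# $1 is the stats file (hypothetical parameter, defaults to the /proc path).
rpc_calls() {
    awk '$1 == "rpc" { print $2 }' "${1:-/proc/net/rpc/nfsd}"
}

# Succeed if the counter moved between an older sample ($1) and a newer one ($2).
nfs_was_active() {
    [ "$2" -gt "$1" ]
}
```

    A cron job could save the output of rpc_calls every five minutes, and a shutdown wrapper could compare the saved value with a fresh one and warn if nfs_was_active succeeds.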

  2. But... by Trbmxfz · · Score: 3, Informative

    Not quite an answer to the article's question, but...

    Theoretically, once the NFS server has crashed, shouldn't all clients simply freeze until the server is back? On all systems I have used, this was the observed behavior, and it is actually quite useful: it seems to avoid all data loss problems (under certain conditions). When the NFS server becomes reachable again, all running programs go on executing as if nothing had happened.

    A solution to the original problem, though, would be: tell all users that the NFS machine is to be left powered on at all times.

  3. You've got something wrong... by Whip · · Score: 4, Informative

    If your NFS server rebooting, shutting down, or crashing causes any problem other than temporarily 'hung' clients, you have something wrong.

    NFS is explicitly designed to be stateless, precisely to allow it to function across server reboots, crashes, and other fun. If your clients are crashing, or getting back corrupted data, something is screwed up somewhere.

    And, by the way, if you're getting corrupted data on a server crash, and the server is linux, you just had an object lesson on why it's bad that linux NFS defaults to async writes. :)
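
    One way to check for that, sketched under the assumption that the exports are listed in the usual /etc/exports format ("/path client(options)"). The function name is hypothetical, and this only finds "async" where it is written out explicitly, not where it applies as a default:

```shell
#!/bin/sh
# Print every export line that explicitly carries the "async" option.
# $1 is the exports file (defaults to /etc/exports); comments are stripped first.
async_exports() {
    sed 's/#.*//' "${1:-/etc/exports}" | grep -E '[(,]async[,)]'
}
```

    Running exportfs -v on the server should show the options that are actually in effect, defaults included.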

  4. Try rwall or similar by ctr2sprt · · Score: 3, Informative

    There's a network-able version of wall that uses RPC (I think). It's not a foolproof solution, since it won't work if your users are logged in without an open terminal window, but it's a help. I'm sure it's terrifically insecure, but since you're running NFS you're already insecure (and so hopefully have a firewall).

    If that isn't good enough for you, there are a couple other possibilities. You could probably cobble together an utterly trivial Python (or Perl or whatever) script on your client machines, then have the server invoke it via ssh when a shutdown starts. If you aren't a programmer at all, you could try firing off an email to the client machines. As long as you have a periodic mail-checker going, it would alert you to the arrival of a new message. (Since you'd be able to use the local spool, you could have it check every 15 seconds.)
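
    The ssh variant might look roughly like this. It is a sketch that uses plain wall on the client instead of a custom script, and it assumes the server can ssh to each client without a password prompt. notify_clients and the REMOTE_SH override are invented names:

```shell
#!/bin/sh
# Broadcast a message to every logged-in terminal on each client.
# $1 is the message; the remaining arguments are client hostnames.
# REMOTE_SH (hypothetical) lets you swap ssh for another remote shell.
notify_clients() {
    msg=$1
    shift
    for client in "$@"; do
        # wall reads the text to broadcast from standard input
        printf '%s\n' "$msg" | ${REMOTE_SH:-ssh} "$client" wall \
            || echo "could not notify $client" >&2
    done
}
```

    The server's shutdown script would call something like: notify_clients "NFS server going down in 5 minutes" desktop1 desktop2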

  5. use correct mount options by stevef · · Score: 5, Interesting

    If you use the correct mount options you should not have to worry about corruption when the nfs server goes away.

    The options you want (for filesystems mounted rw) are:

    rw,hard,nointr...

    A lot of people don't like these options because it means that the clients will hang until the server returns, but it is THE RIGHT THING TO DO if you are mounting important data rw. If you can't stand for your clients to hang, maybe replace 'nointr' with 'intr', but you've been warned.

    Steve
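
    To see whether any client is currently running with the risky setting, something like the following could be run on each client. It is a sketch that assumes a Linux client, where /proc/mounts lists the device, mount point, filesystem type and options in that order. The function name is made up:

```shell
#!/bin/sh
# List NFS mount points whose options include "soft".
# $1 is a file in /proc/mounts format (defaults to /proc/mounts).
soft_nfs_mounts() {
    awk '$3 ~ /^nfs4?$/ && $4 ~ /(^|,)soft(,|$)/ { print $2 }' "${1:-/proc/mounts}"
}
```

    No output means every NFS filesystem on that client is hard-mounted.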

  6. maybe... by Froze · · Score: 3, Interesting

    use lsof to monitor tcp/udp/rpc sockets that are open on the host and pointing at the file space that nfs is serving.

    Then write a wrapper around each of halt, shutdown, and reboot to check the open ports and fail if they are active.

    Seems fairly hackish, but... whaddya expect from /.?
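
    A sketch of such a wrapper, swapping lsof for ss and only looking at established TCP connections to the standard NFS port 2049. Clients mounting over UDP would not show up, and an idle client may not hold a connection at all. Both function names are hypothetical:

```shell
#!/bin/sh
# Count established TCP connections to port 2049 in `ss -tn` output read
# from stdin. Column 4 of that output is the local address:port.
count_nfs_conns() {
    awk '$1 == "ESTAB" && $4 ~ /:2049$/ { n++ } END { print n + 0 }'
}

# Refuse to shut down while any client is still connected.
guarded_shutdown() {
    n=$(ss -tn | count_nfs_conns)
    if [ "$n" -gt 0 ]; then
        echo "refusing shutdown: $n NFS connection(s) still open" >&2
        return 1
    fi
    /sbin/shutdown "$@"
}
```

    Installed ahead of the real binaries in root's PATH, guarded_shutdown -h now would only go through once the count drops to zero.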

    --
    -- The morphemes of your disquisition are ascertainable, but they have eschewed an ambit of transpicuous exposition.

  7. NFS Locking Service might work? by stefanlasiewski · · Score: 4, Interesting

    I can't remember the details on this, but would the NFS Locking Services work for you?

    NFS input/output is stateless, but I believe the locking mechanism is stateful.

    When clients are accessing a file, a lock is established. When the client is done with the file, the lock is removed. You can see who has what resource locked with a utility (I forget which, but fcntl() and lockf() come to mind).

    In a shutdown script, look for locks, and refuse to proceed until the locks are cleared.
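
    A very rough sketch of the lock check, assuming a Linux server and assuming that locks taken on behalf of NFS clients appear in /proc/locks alongside local ones. It counts every POSIX lock on the machine, not only NFS ones, and a lock exists only when a client program explicitly asks for one, so an open file without a lock is invisible here. The function name is invented:

```shell
#!/bin/sh
# Count POSIX (fcntl-style) locks currently held, one per matching line.
# $1 is a file in /proc/locks format (defaults to /proc/locks).
count_posix_locks() {
    awk '$2 == "POSIX" { n++ } END { print n + 0 }' "${1:-/proc/locks}"
}
```

    A shutdown script could loop and sleep until the count reaches whatever the server's idle baseline is.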

    --
    "Can of worms? The can is open... the worms are everywhere."

  8. maybe i'm just stupid.. by gl4ss · · Score: 3, Insightful

    but wouldn't one of the key things to consider when building such a system be that a) it's down as little as possible and b) when it goes down it's well known beforehand (and the users can be told in advance that it will go down at time x and they're fsced if they don't get out before it).

    look, what point would there be in initiating the shutdown if you didn't know when the users will get out anyways? it could take hours/days before it would actually boot, and if that doesn't matter(waiting _hours_) then why would you be booting it in the first place? just out of habit?

    anyways.. it sounds a lot like you should be focusing on why you have to be booting it instead of how the booting occurs.. so that you wouldn't need to be booting it.

    --
    world was created 5 seconds before this post as it is.

  9. Re:Mod parent up by HalfFlat · · Score: 3, Interesting

    Back when I was administering a mixed Unix network, we used to say the two NFS mount options were 'hard' and 'corrupt'.

    I believe that it is theoretically possible to write software that can survive a soft mounted filesystem disappearing from under it, but no one ever does. How often do people check the return value from write()? And in memory mapped io land, it would be nasty.
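
    For scripts, at least, checking is cheap. A minimal sketch (checked_write is a made-up name) that reports a failed write instead of ignoring it. Over NFS the error may only surface when the file is closed, so this catches what the writing command itself reports and nothing more:

```shell
#!/bin/sh
# Copy stdin to the file named by $1 and report failure instead of ignoring it.
# On a soft mount, an unreachable server would show up here as a non-zero status.
checked_write() {
    if ! cat > "$1"; then
        echo "write to $1 failed" >&2
        return 1
    fi
}
```

    Usage would be along the lines of: generate_report | checked_write /mnt/nfs/report.txt || exit 1 (generate_report stands in for whatever produces the data).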

  10. Yeah, but the client thinks it's stateful by bill_mcgonigle · · Score: 4, Informative

    put a program on each client machine, call it nfsmounts and it would go a little something like this:

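    mount | grep -c "$1"    # count mounts that mention the server name passed as $1

    then write a wrapper on the server that does

    ok=true
    for client in $CLIENTS; do    # CLIENTS: your own space-separated list of client hostnames
        mounted=$(ssh "$client" nfsmounts "$(hostname)")
        [ "${mounted:-0}" -gt 0 ] && ok=false    # an unreachable client counts as 0
    done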

    you can hook that into your shutdown script, and abort if there are any clients who think they have a mounted drive.

    of course, read the other suggestions about mount options. No one's mentioned sync yet, but don't mount your shares async even though the performance is so much better or you'll lose data.

    --
    My God, it's Full of Source!
    OUTSIDE_IP=$(dig +short my.ip @outsideip.net)

  11. Re:Implementation by Glonoinha · · Score: 4, Insightful

    Even better thought, he could decide that there actually is a distinction between server duty and workstation duty and decide which this particular machine is going to pull. If he needs the machine to run as a workstation, quit trying to use an unstable environment as a server. If the files and stability of the system are of any importance whatsoever then it is a server, treat it as such and buy another computer to use as a workstation (they are dirt cheap now.) Pretty simple.

    Want to see your uptime and stability rise incredibly on the server? Put it in the closet on a UPS and once it is running turn off the monitor, unplug the keyboard, and tape a piece of cardboard over the power switch so it doesn't get turned off by accident. Where the machine used to sit put a cheap replacement computer to use as a workstation - even new entry level boxes are starting at under $500 fully loaded (a little wimpy, but including all the necessary parts including a monitor) and used hardware has gotten insanely cheap (ie $200 for a full machine that is a generation or two old, PIII .5 to 1GHz range with a CRT.)

    That said, I am going to read every post in this thread to get a better understanding of how to do this - now you have my interest up.

    --
    Glonoinha the MebiByte Slayer