Slashdot Mirror


Tips for Increasing Server Availability?

uptime asks: "I've got a friend who needs some help with his web server availability. On two separate occasions, his server has had a problem that caused it to be unavailable for a period of time. One was early on and was probably preventable, but this latest one was due to two drives failing simultaneously in a RAID5 array. As a web business moves from a small site to a fairly busy one, availability and reliability become not only more important but, it seems, more difficult to achieve. Hardware gets bigger, services get more expensive, and options seem to multiply. Where could one find material on recommended strategies for increasing server availability? Anything related to equipment, configurations, software, or techniques would be appreciated."

5 of 74 comments

  1. Hosting by hatch815 · · Score: 5, Insightful

    If you are moving to a level where you need uptime but can't dedicate more resources to overseeing it, you may want to consider a hosted solution. They host, monitor, upgrade, and do checkups (YMMV depending on whom you choose).

    If that isn't a road you want to go down, then start planning outages for fsck, upgrades, and standard checkups. There are also plugins for Nagios that will check RAID controller status, server response, and server load.
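
As a rough illustration of the kind of RAID check mentioned above, here is a minimal sketch of a Nagios-style plugin. It assumes Linux software RAID (md) and reads the text of /proc/mdstat; hardware RAID controllers need their vendor's tools instead, so this is a stand-in and not one of the existing plugins. The function names `check_mdstat` and `main` are made up for the example. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_mdstat(text):
    """Return (exit_code, message) for the contents of /proc/mdstat."""
    degraded = []
    current = None
    found = 0
    for line in text.splitlines():
        # An array header looks like "md0 : active raid5 sdc1[2] sdb1[1] sda1[0]".
        header = re.match(r"^(md\d+)\s*:", line)
        if header:
            current = header.group(1)
            found += 1
            continue
        # The status line ends in e.g. "[3/2] [UU_]"; "_" marks a missing member.
        status = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if status and current:
            if "_" in status.group(3):
                degraded.append(f"{current} [{status.group(3)}]")
            current = None
    if found == 0:
        return UNKNOWN, "UNKNOWN: no md arrays found"
    if degraded:
        return CRITICAL, "CRITICAL: degraded arrays: " + ", ".join(degraded)
    return OK, f"OK: {found} array(s) healthy"


def main(path="/proc/mdstat"):
    """Print the status line Nagios expects and return the exit code."""
    try:
        with open(path) as f:
            code, msg = check_mdstat(f.read())
    except OSError as exc:
        code, msg = UNKNOWN, f"UNKNOWN: {exc}"
    print(msg)
    return code
```

A wrapper script would finish with `sys.exit(main())` so that Nagios sees the exit code. The point is that a degraded array pages someone right away instead of sitting unnoticed until a second drive goes.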

  2. Um, details? by afabbro · · Score: 1, Insightful
    You have a budget of $1 million?
    You are hosting this on a 56K dial-up in your root cellar?
    Your apps need to run on Microsoft Windows or HP-UX or...?
    You've got a SAN or local disk or...?
    You're using home-built white-box x86s or Sun E15000s or...?
    You have sysadmin talent on hand? You're outsourced to IBM global services?

    Who vets these silly questions? Oh, I forgot - the "Editors".

    --
    Advice: on VPS providers
  3. Hire a Professional? by marcus · · Score: 3, Insightful

    That is all...

    --
    Good judgement comes from experience, and experience comes from bad judgement.
    - W. Wriston, former Citibank CEO
  4. Probability of simultaneous two disk failure by metoc · · Score: 2, Insightful

    is extremely low given the MTBF of modern drives. You are more likely to see a power supply or fan failure.

    On that basis I am going to make some wild assed guesses that are more probable given the little information we have.

    1) the drives were consumer models from the same production lot,
    2) the death of the first drive was not immediately noticed,
    3) compatible replacement drives are not easy to come by (no hot spare),
    3) the second drive died before the first one was replaced,
    4) the server did not have hot swap drive carriers
    5) someone tried to replace the dead drive in the running chassis
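
To put rough numbers on the MTBF argument and on the guess that the first failure went unnoticed, here is a back-of-the-envelope sketch. It assumes independent drives with exponentially distributed lifetimes, which is exactly what same-lot drives would violate, and the MTBF and array size are made-up illustrative figures, not data from the question.

```python
import math


def p_second_failure(mtbf_hours, remaining_drives, window_hours):
    """Probability that at least one remaining drive fails within the window,
    assuming independent, exponentially distributed failures."""
    rate = remaining_drives / mtbf_hours  # combined failure rate per hour
    return 1.0 - math.exp(-rate * window_hours)


# Hypothetical figures: a 4-drive RAID5 (3 survivors) and a 500,000-hour MTBF.
for label, hours in [("replaced within 1 day", 24),
                     ("unnoticed for 30 days", 24 * 30)]:
    print(f"{label}: {p_second_failure(500_000, 3, hours):.4%}")
```

Under these assumptions the risk grows roughly in proportion to how long the array runs degraded, which is why monitoring and a hot spare matter more than the headline MTBF.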

    If you don't like my guesses provide your own

    1. Re:Probability of simultaneous two disk failure by mangu · · Score: 4, Insightful
      If you don't like my guesses provide your own


      6) The drives are overheating. This happened to my two Seagate 200 GB drives. I had to mount them in a heatsink; the normal bay does not provide adequate cooling for Seagate's 7200 rpm drives.
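
One way to catch overheating before it kills a drive is to watch the temperature the drive reports over SMART. The sketch below parses the text printed by `smartctl -A` (from smartmontools) and pulls the raw value of attribute 194. It assumes the drive exposes attribute 194 with the temperature in Celsius as the first raw number, which is common but not universal; the function names and the 50 C limit are made up for the example, not a vendor figure.

```python
def drive_temperature_c(smartctl_output):
    """Return the raw value of SMART attribute 194 from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten or more columns; the raw value is the tenth.
        if len(fields) >= 10 and fields[0] == "194":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def too_hot(smartctl_output, limit_c=50):
    """True if a temperature was found and it exceeds the (assumed) limit."""
    temp = drive_temperature_c(smartctl_output)
    return temp is not None and temp > limit_c
```

Feeding this from a periodic `smartctl -A` run on each drive and alerting when `too_hot` returns True would flag a badly cooled bay long before two drives cook together.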