Slashdot Mirror


DNS based Website Failover Solutions?

Chase asks: "I run a couple of websites (including for my work). I'd like to have a backup web server that people would hit when my server goes down. My primary host is on my company's T1 line, and even though I've had my server die once, the most common reason for my sites to be offline is that our T1 goes down. I've looked at the High-Availability Linux Project, but it seems that almost everything there is for failover using IP takeover, which isn't an option if my network link dies and my backup server is on a different network. ZoneEdit seems to offer what I'm looking for, but I want a do-it-myself solution. The only software I've found is Eddie, and it seems to have stopped development around 2000. I know DNS-based failover doesn't give 100% uptime, but with a low cache time and decent monitoring it seems like the best solution for having my backup server at a different location and on a different network. Anyone know of a good solution? (Using Linux and/or Solaris hosts)"

12 of 39 comments

  1. Depends whether you want to pay for it . . . by unixbob · · Score: 4, Informative

    If I understand you correctly, you are looking for a F/OSS project to do what you are after.

    However, if you do actually have a budget to spend, have a look at the 3DNS product from F5 Networks. It does the failover you describe, and although it works better when it is interacting with F5's server load balancing product, it can still monitor and react to standard web servers becoming unavailable.

    --
    The Romans didn't find algebra very challenging, because X was always 10
  2. Re:Dyndns by anicklin · · Score: 2, Informative

    dyndns is pretty good in that with a custom domain, you can set an 'offline' redirect URI. However, this has to be done manually with an internet connection - kind of a problem if the dedicated public connection is unavailable, although you could always revert to some sort of dialup to get onto their web site and update it.

    They will let you configure custom TTL values on A (host) records. I set mine to 5 minutes and it works just fine.

    There are some automated engines out there which will update the dyndns service automatically, but I have not seen any which will automatically set the unavailable URI if the primary internet connection isn't available.

    dyndns is more oriented at people who want to host but their address changes frequently, whether for black-hat, white-hat or ISP DHCP reasons. However, while reliability has never been a problem with their service, it may not suit the needs of a more commercial customer.

    Just my two cents as a happy user.

  3. uhhhh by nocomment · · Score: 2, Informative

    If your T1 is down that often, I'd change providers. My T1 has been 'slow' once in the past year, with 1 outage that lasted for about an hour when we first installed it.

    --
    /* oops I accidentally made a comment, sorry */
    /* http://allyourbasearebelongto.us */
    1. Re:uhhhh by nocomment · · Score: 2, Informative

      If you need the QoS, but not necessarily a full T1, maybe you should look at SDSL. With ADSL the phone company owns the switching equipment and can turn it off/move/upgrade/whatever whenever they want. But with SDSL the provider (i.e. Speakeasy, Covad (if Covad does SDSL)) owns the switching equipment and will skip over it when doing their moves/upgrades/whatever. Speakeasy has a QoS guarantee. I still feel safer with a T1 though :-)

      Backhoes are easy to fix. I remember when I worked at Mindspring (pre-Earthlink) there was a major outage (a hurricane, I think) in NY that not only broke the T1 (there was exposed fiber) but also left it under 30' of water. It took 7 days to drain the water before the cables could be repaired.

      --
      /* oops I accidentally made a comment, sorry */
      /* http://allyourbasearebelongto.us */
  4. A few ways.. by ADRA · · Score: 4, Informative

    1. Use colocation/Web hosting as the primary site. Their uptimes are usually very strong.

    2. You will need a second line. Mandatory. If you really want insane uptime, you'll need dynamic routes a la BGP from both ISPs. If you don't need that much, you could maybe work with an automated probe-and-dnsupdate script which can run outside the network. It would switch the primary DNS to and from the backup IP address, which is on the isolated network.

    3. Have an equalized DNS entry for both IP addresses. It gives the client a 50% chance of connecting once one of them is dead, but it's better than nothing.

    4. Tell the site visitors to connect to www1.mysite.com if they're having trouble reaching your site, and have www1 pointing to your backup IP. Make sure your DNS servers are network redundant as well, or the whole exercise is pretty pointless.
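
    Options 3 and 4 both depend on the client ending up at the second address. A minimal sketch of a client that walks a list of A records until one answers, which is the best case for option 3. The addresses are made-up documentation addresses, and not every browser or resolver library behaves this way:

```python
import socket


def connect_first_available(addresses, port, timeout=5.0,
                            connect=socket.create_connection):
    """Try each address in order; return (address, connection) for the first
    one that accepts a connection. `connect` is injectable for testing."""
    last_error = None
    for address in addresses:
        try:
            return address, connect((address, port), timeout)
        except OSError as exc:
            # Dead server or dead link: remember why and try the next record.
            last_error = exc
    raise ConnectionError(f"no address answered: {last_error}")


# Hypothetical usage with a primary and a backup address:
# address, conn = connect_first_available(["192.0.2.10", "198.51.100.20"], 80)
```

    A client that only ever tries the first address it is handed gets the 50% odds described in option 3; one that falls through the list like this reaches the backup after one timeout.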

    --
    Bye!
  5. You could always use IPv4 Anycasting. by Mordant · · Score: 2, Informative

    More information here.

  6. RFC 2136 + Net::DNS + your monitoring software by embobo · · Score: 3, Informative

    Ignoring the fact that DNS wasn't designed to handle this (setting your ttl to a low time (e.g., 5min) generates a good amount of useless traffic when your site is up), here is how you might do it:

    First, you need to have a monitoring system on the Internet somewhere, not through your T1 because if that goes down it won't be able to update your DNS. You have that already, I'm sure, to test your web site accessibility from the Internet. Of course, at least one of your name servers must be accessible when the T1 goes down too, so that will have to be somewhere (other than on your T1) on the Internet as well.

    On this name server enable dynamic updates. Modify your monitor system that checks availability of your site to use Net::DNS to update the IP address of your web server when the monitor fails.

    Going all open source, I'd use Net::DNS and nagios for the monitoring software, bind for the name server (which supports dynamic updates), with Linux as the OS.
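
    A rough sketch of the monitor-plus-dynamic-update idea. It swaps Net::DNS for BIND's nsupdate command driven from Python, so treat it as one possible shape rather than the setup described above. The IP addresses, the name server ns2.example.net, the zone example.com, and the key file are all placeholders. The probe goes to the primary's IP directly, so the result does not depend on where DNS currently points:

```python
import http.client
import subprocess
import urllib.request

PRIMARY_IP = "192.0.2.10"    # assumed address of the server behind the T1
BACKUP_IP = "198.51.100.20"  # assumed address of the backup on another network


def primary_is_up(host="www.example.com", timeout=5.0):
    """Probe the primary by IP address, sending the site's Host header."""
    request = urllib.request.Request(f"http://{PRIMARY_IP}/", headers={"Host": host})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, http.client.HTTPException):
        return False


def choose_address(primary_up):
    """Pick the address the A record should carry right now."""
    return PRIMARY_IP if primary_up else BACKUP_IP


def build_nsupdate_script(server, zone, name, address, ttl=300):
    """Build the command script that nsupdate reads on standard input."""
    return "\n".join([
        f"server {server}",
        f"zone {zone}",
        f"update delete {name} A",
        f"update add {name} {ttl} A {address}",
        "send",
        "",
    ])


def push_update(script, key_file=None):
    """Send the update with nsupdate; -k names a key file if one is used."""
    command = ["nsupdate"] + (["-k", key_file] if key_file else [])
    subprocess.run(command, input=script, text=True, check=True)


# Hypothetical usage, run from a host that is not behind the T1:
# push_update(build_nsupdate_script("ns2.example.net", "example.com",
#                                   "www.example.com.", choose_address(primary_is_up())))
```

    In practice you would only push an update when the chosen address changes, and have nagios (or whatever does the monitoring) call it as an event handler.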

    1. Re:RFC 2136 + Net::DNS + your monitoring software by byolinux · · Score: 3, Informative

      Nagios

      with Linux as the OS

      Kernel! And anyway, does the fact you're using GNU/Linux or *BSD actually make a difference to this?

    2. Re:RFC 2136 + Net::DNS + your monitoring software by FistFuck · · Score: 3, Informative

      I do it now with two shell scripts.

      The key is that I use tcpclient from DJB's ucspi-tcp package:

      http://cr.yp.to/ucspi-tcp.html

      Don't hurt yourself with BIND, either. Parsing that file is going to hurt your brain. I use grep -v to manage my data file for tinydns:

      http://cr.yp.to/djbdns.html

      Maybe I'll get around to publishing my work. A brief synopsis:

      I do a TCP connection to port 80 on my webservers with a 5 second timeout. If the connection fails, it pulls all IPs associated with that server out of my DNS. Not only does this determine if the server is up, but it also determines if the server needs less load because it can't get to my request in time.

      There's a state file for each webserver, ie webserver.up or webserver.down. That's easy to look for later to determine if I need to change the DNS tables.

      I run the check every 60 seconds. I only have two servers so it's not too tough.

      I also check www.yahoo.com and www.google.com availability over each ISP to determine if an ISP is available. I update DNS based on the ISP conditions as well.

      I say again, try to avoid BIND if you can, I can't think of a sane way to process your zone files with shell scripting.
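
      The scripts themselves aren't posted, so the following is only a guess at the same three steps in Python rather than shell: a TCP check on port 80 with a 5 second timeout, a webserver.up / webserver.down state file, and a grep -v style filter over tinydns data lines. The function names and the state directory are invented for illustration, and the filter matches the IP as a whole colon-delimited field, which is a bit stricter than a bare grep -v:

```python
import os
import socket


def port_answers(host, port=80, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def drop_address(data_lines, ip):
    """Keep every tinydns data line that does not carry this IP as a field.
    Lines look like '+www.example.com:192.0.2.10:60' (the TTL is optional)."""
    needle = f":{ip}:"
    return [line for line in data_lines
            if needle not in line.rstrip("\n") + ":"]


def record_state(state_dir, server, is_up):
    """Leave exactly one of server.up / server.down in state_dir.
    Returns True when the state differs from the previous run."""
    new = os.path.join(state_dir, f"{server}.{'up' if is_up else 'down'}")
    old = os.path.join(state_dir, f"{server}.{'down' if is_up else 'up'}")
    changed = not os.path.exists(new)
    if os.path.exists(old):
        os.remove(old)
    open(new, "a").close()
    return changed
```

      After the data file is rewritten, tinydns still needs data.cdb rebuilt with tinydns-data before the change is served.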

    3. Re:RFC 2136 + Net::DNS + your monitoring software by ptudor · · Score: 2, Informative
      ...I can't think of a sane way to process your zone files with shell scripting.
      Luckily, when moving to tinydns there is a sane way to convert your zone files with shell scripting.
  7. We tried it, and it didn't work. by Anonymous Coward · · Score: 1, Informative

    That's right, it didn't! We found that even when we set the TTL to 60 seconds, some DNS servers still cached the old name look-up for hours, if not days. One of our remote sites was using the Windows NT DNS server, and it cached out-of-date name look-ups for 30 days! Damn Microsoft. This makes DNS-based failovers useless for most purposes.

  8. Don't use DNS failover. by Harik · · Score: 3, Informative
    More than one large company enforces a minimum TTL to cut down on outbound lookups. Notably, AOL clients keep hitting the old address up to 24 hours after the switchover. Other ISPs/firewalled companies do the same.

    Then again, if it doesn't matter to you, don't worry about it. Just do RR-DNS and manually cut out the failed IP. "Most" people will get the still-working servers.
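
    To put rough numbers on the minimum-TTL problem: a back-of-the-envelope estimate, assuming the worst case is one full check interval to notice the outage plus one full cache lifetime for a resolver that fetched the old record just before the update. The 60 second check, 5 minute TTL, and 24 hour floor are the figures mentioned in this thread; the function itself is only an illustration and ignores propagation to secondary name servers:

```python
def worst_case_stale_seconds(check_interval, record_ttl, resolver_min_ttl=0):
    """Upper bound on how long a client may keep hitting the dead address.
    A resolver that enforces a minimum TTL effectively overrides yours."""
    effective_ttl = max(record_ttl, resolver_min_ttl)
    return check_interval + effective_ttl


# 60 s checks, 5 minute TTL, well-behaved resolver
honest = worst_case_stale_seconds(60, 300)

# same setup behind a resolver that enforces a 24 hour minimum
clamped = worst_case_stale_seconds(60, 300, resolver_min_ttl=24 * 3600)
```

    With a resolver that honors the TTL, that is 6 minutes; behind one with a 24 hour floor, it is a day and a minute, no matter how low you set your own TTL.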