Slashdot Mirror


Implementing a Load-Balanced Webserver?

Amoeba Protozoa asks: "How do I implement a load-balanced, layer 4 switching web-server? Would it be possible to mix O/Ss? Besides your incoming bandwidth, where do the bottlenecks occur? I would prefer to use Apache, Linux or BSD, and be able to utilize mod_perl or PHP to access a shared MySQL database. I would like to make this setup as scalable as possible."

18 comments

  1. Define your terms by Clover_Kicker · · Score: 1

    There's all sorts of load balancing.

    You can have 2 or more machines (with different IP addrs, naturally) using round robin DNS to answer to the same name.
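    A rough sketch of what round robin DNS does, as a toy model rather than real name server code: the server hands back the same set of addresses on every query but rotates their order, so successive clients try a different machine first. The class name and the addresses (taken from the 192.0.2.0/24 documentation range) are made up for illustration.

```python
from collections import deque


class RoundRobinZone:
    """Toy model of a DNS server that rotates the A records for one name."""

    def __init__(self, addresses):
        self._records = deque(addresses)

    def resolve(self):
        # Return the current ordering, then rotate so the next query
        # sees a different address first.
        answer = list(self._records)
        self._records.rotate(-1)
        return answer


# Hypothetical addresses; a real zone would list your own web servers.
zone = RoundRobinZone(["192.0.2.10", "192.0.2.11", "192.0.2.12"])
first_choices = [zone.resolve()[0] for _ in range(6)]
print(first_choices)
```

    Most clients simply use the first address in the answer, which is why rotating the order spreads them across the machines.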

    You can have separate boxen for HTML, graphics, and the database. Depending on what you're trying to accomplish, you may be able to split the db among 2 or more boxes.

    If you've got (for example) one machine as HTML and a different one for the db, it really doesn't matter what OS they're running.

    What do you mean by layer 4 switching? The last time I saw that buzzword was in brochures for networking h/w, i.e. glorified ethernet switches. Save yourself a few bucks and just stick everything on 100BaseT. If you can afford a network connection that makes 100BaseT a bottleneck, then you can afford to hire someone to do all this for you.

    Oh, and as to bottlenecks... Take a look at some of the tutorials about reducing the storage requirements of your GIFs. It's amazingly easy to shrink your GIFs, and that's probably the cheapest way to optimize a web site.

    Yeah, this is vague, but your question is a bit vague too. Don't take my word for it, set it up and test the hell out of it.

  2. Unisys witch hunt by chargen · · Score: 0

    Of course if you want to use GIFs you'll have to pay the lovely Unisys $5000 license fee. Try PNG format for your images. It's just awesome.

    1. Re:Unisys witch hunt by Zurk · · Score: 2

      Not if you use GIFs that aren't encoded with the LZW algorithm. Of course they're bigger. But not all browsers support PNG, and JPG is still an option.

    2. Re:Unisys witch hunt by Betcour · · Score: 1

      PNG is fine: Internet Explorer has supported it since version 4 and Netscape since 4.04. That accounts for over 95% of the browsers out there. Amongst the other 5% are folks who use Lynx, so they don't care. Screw the others, they have to upgrade - period.

      And anyway, most HTML is made for 4.x generation browsers and is no longer tested on older ones. So supporting those older browsers isn't even a modern issue.

    3. Re:Unisys witch hunt by spinkham · · Score: 1

      Except all browsers seem to have trouble with png and embed tags..
      Before it was just IE and Opera, but as of the 4.7 redhat package anyway, netscape also handles them incorrectly.. Only Mozilla does this right...
      Check out this page to test it:
      http://www.w3.org/Graphics/PNG/Inline-embed.html
      I get this message:
      /home/spinkham/.netscape/cache/1c/cache3805FF3c2454B0A.png: unknown or unsupported image type.

      On refresh, png and object tags don't work correctly either...
      Only with img tags do they work right.. Ugh...
      I thought as version numbers increase, things were supposed to get better?
      Nah, shovelware has done away with that concept. Now software gets worse with each revision.
      (OK, I'll get off my soapbox now...)

      --
      Blessed are the pessimists, for they have made backups.
    4. Re:Unisys witch hunt by treke · · Score: 1

      Actually, you don't need to pay the license fee if you use tools that are licensed to create the images.
      treke

    5. Re:Unisys witch hunt by Betcour · · Score: 1

      Bah, I always use PNG in IMG tags, this always works fine, and it is the "normal" way to use PNG anyway (embed and object tags are not really good things...). The thing is that you can just replace all your GIFs with PNGs and keep using the same IMG tag, and over 95% of users won't even notice the change (except for the speed boost, PNG being 20-30% smaller).

  3. NAT with Cisco Local director or ... Linux by Anonymous Coward · · Score: 1

    Network Address Translation is a good way to build a Web cluster. There are several benefits to this architecture:

    - You can add and remove web servers from the cluster at any time.
    - The traffic is evenly distributed to each server (that is not true with DNS round robin, because clients cache the addresses locally).

    But the bottleneck is the device(s) that do the address translation:
    http://www.csn.tu-chemnitz.de/~mha/linux-ip-nat/diplom/node4.html#SECTION00043100000000000000

    You could do a reverse proxy with Squid too.
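    To see why cached DNS answers skew the load while a NAT device does not, here is a small simulation. It is only a sketch under made-up assumptions: the server names, the client names, and the request counts are all hypothetical, and a real NAT box balances connections rather than idealized "requests".

```python
from collections import Counter
from itertools import cycle

SERVERS = ["web1", "web2", "web3"]  # hypothetical backend names

# Hypothetical clients and how many requests each one sends.
CLIENT_REQUESTS = {"big-proxy": 900, "office": 60, "modem-a": 20, "modem-b": 20}


def dns_round_robin(clients):
    """Each client resolves once, caches the answer, and reuses it."""
    load = Counter()
    rotation = cycle(SERVERS)
    for requests in clients.values():
        server = next(rotation)
        load[server] += requests
    return load


def per_request_balancer(clients):
    """A NAT-style balancer picks a server for every incoming request."""
    load = Counter()
    rotation = cycle(SERVERS)
    for requests in clients.values():
        for _ in range(requests):
            load[next(rotation)] += 1
    return load


dns_load = dns_round_robin(CLIENT_REQUESTS)
nat_load = per_request_balancer(CLIENT_REQUESTS)
print("DNS round robin:", dict(dns_load))
print("Per-request NAT:", dict(nat_load))
```

    With these numbers the one busy proxy pins almost all of its traffic to a single server under DNS round robin, while the per-request balancer splits the same 1000 requests nearly evenly.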

    1. Re:NAT with Cisco Local director or ... Linux by Anonymous Coward · · Score: 0

      This guy is right, using NAT is probably your best bet if you don't want to spend any money. There are solutions out there like BIG/IP2 (I think the company is f5 labs) that do it for you. I used to work at a company that did about 50TB/month in web serving and we used BIG/IP2. Once you get the architecture down and understand how the thing works you should find great success with it. I think it was expensive though.

      The most difficult challenge you are going to have in developing a "load-balanced" or "shared-nothing" clustered web solution is managing session state. If you are capable of caching your entire session state on the client, then go for it; that will give you the greatest amount of scalability and flexibility. It's a difficult challenge to manage everything in a way that will provide you with the greatest amount of scalability and availability.

      Just remember to avoid machine affinities. If each request from a client may be routed to a different server, so that you cannot guarantee you'll get the same machine each time, you have to make sure you didn't cache any session-related information on the machine that handled the first request, or else you won't be able to get to it.

      Sam Wilson swilson@bsd4us.org
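      One way to keep session state on the client, sketched here as an illustration rather than as what the poster actually ran: serialize the session, sign it with a key that every web server shares, and send it back as a cookie value. Any server can then verify and read it without having handled the earlier request. The key, the function names, and the sample session data are all assumptions.

```python
import base64
import hashlib
import hmac
import json

# Assumption: every web server in the cluster is configured with this same key.
SECRET_KEY = b"replace-with-a-real-shared-secret"


def encode_session(data: dict) -> str:
    """Serialize session data and append an HMAC so tampering is detectable."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode())
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + signature


def decode_session(token: str):
    """Return the session dict, or None if the token is malformed or forged."""
    payload, sep, signature = token.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


token = encode_session({"user": "amoeba", "cart": [101, 202]})
print(decode_session(token))        # any server holding SECRET_KEY can read it
print(decode_session(token + "0"))  # an altered token is rejected
```

      Note that the payload is signed, not encrypted, so the client can read it; keep secrets out of it, and keep it small, since it travels with every request.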

  4. Cluebat by colonel · · Score: 0

    Ya know, it would help if people did simple google
    searches before posting to askSlashdot. But I guess that's asking a little much.

    http://www.linuxvirtualserver.org

    RedHat also has Piranha, but that's (IMO) a cheap
    hack made to meet a deadline.

    1. Re:Cluebat by Zurk · · Score: 1

      Or looked on freshmeat.net.. there were at least 3 packages out there which do this. One of 'em was the LinuxHA project (of course). BTW, I do some redundancy w/o using this stuff, using rsync over ssh and a backup "mirror" server:

          rsync -avu -e ssh /home remoteserver.xx.yy:/mirror/ >syncem.out

      cron runs as follows:

          50 2 * * * /root/syncem

      i.e. at 2:50 am every day both are in sync. This is not load balancing of course, but a simple mirroring method with redundancy. Just for everyone's info.

  5. mod_backhand by your+jesus · · Score: 2


    If you like Apache, check out mod_backhand. It is a module load-balancer that is under development (but works well now) over at The Center for Networks and Distributed Systems at Johns Hopkins.

    It is a module that incurs almost NO overhead. You can mark directories or locations with Load Balancing policies and BOOM. That is it. It communicates with other Apache servers via multicast and handles the rest. You can even plug in your own decision making algorithms. It is super simple to load balance cgi-scripts to some machines, mod_perl database scripts to another set and images based on a completely different policy. Or just use our default ;)
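    The idea of per-location policies with pluggable decision functions can be sketched in a few lines. This is a generic illustration, not mod_backhand's configuration syntax or API: the server names, load figures, URL prefixes, and function names are all invented.

```python
import random

# Hypothetical load reports, as servers might announce them to each other.
SERVER_LOAD = {"web1": 0.20, "web2": 0.75, "web3": 0.40, "img1": 0.10, "img2": 0.05}


def least_loaded(candidates):
    """Pick the candidate with the lowest reported load."""
    return min(candidates, key=SERVER_LOAD.__getitem__)


def random_choice(candidates):
    """Pick any candidate uniformly at random."""
    return random.choice(candidates)


# Each URL prefix gets its own candidate pool and decision function.
POLICIES = [
    ("/cgi-bin/", ["web1", "web2"], least_loaded),
    ("/perl/", ["web3"], least_loaded),
    ("/images/", ["img1", "img2"], random_choice),
]
DEFAULT_POLICY = (["web1", "web2", "web3"], least_loaded)


def route(path):
    """Return the server that should handle `path` under the first matching policy."""
    for prefix, pool, decide in POLICIES:
        if path.startswith(prefix):
            return decide(pool)
    pool, decide = DEFAULT_POLICY
    return decide(pool)


for path in ("/cgi-bin/search", "/perl/db.pl", "/images/logo.png", "/index.html"):
    print(path, "->", route(path))
```

    Swapping in a different decision function for one prefix changes only that entry in the table, which is the appeal of keeping policy separate from the routing loop.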

    It currently runs under Linux and Solaris, but the next release will support BSDI as well.

    It is a software solution that can be combined with any hardware solution you choose (if you need that too). You can't lose with this. The install process and setup time combined are very minimal.

    Of course, I am a little biased ;)

    -- Theo Schlossnagle

  6. Bottlenecks by Anonymous Coward · · Score: 1

    Having server redundancy and load-balancing is nice and there are more than just a few options nowadays (free even) that let you do this.

    But to be honest, the Internet is always the real bottleneck. Once you try and provision your first DS3, you will figure this out.

    If your primary concern is scalability, you need to find someone who understands network design issues, including but not limited to:

    Networking protocols like BGP4 and STP (Spanning Tree is very important), VLANs/Trunking/EtherChannel/ISL, all switch and router software/hardware specifications/bugs, Provisioning of Telco circuits, Public and Private Peering, etc.

    If you don't understand how the Internet works and how to build scalable networks (and this takes years of experience), then Knowledge is your biggest bottleneck. I believe someone stated above that beyond 100BaseT, you need to hire someone who knows what they are doing. Well, that sounds about right.

  7. Eddie.... by H-Monk · · Score: 1

    Eddieware is somewhat nifty.

    At least worth a look, anyway.

  8. LinuxDirector is the way to go by thule · · Score: 1

    http://www.linux-vs.org. It uses IBM's technique called direct routing. They say it'll scale to over 100 (or 150??) computers in a cluster. It can also load balance with tunnels for remote systems.

  9. Meaning of "Scalable" by Anonymous Coward · · Score: 0

    Scalable always means different things to different people. If you mean that you want to saturate your T1 with static web pages before your server goes down, all you really need is a standard Linux box.

    If you mean "I want to run eBay", then you should check out an application server. My personal favorite is Dynamo [disclaimer: I work for them] but there are cheaper solutions if you don't have the cash.

  10. Linux Virtual Server by fznck · · Score: 1

    You go to this url: Linux Virtual Server Project and download the tools necessary to build a load balancer based on Linux. This replicates the functionality of the Cisco LocalDirector, which is your other option (pricey at $8k). People will tell you that the LocalDirector blows, and compared to the $18,000 load balancers, it does, but for what it is designed to do, it does its job well. Just like the Linux VS project.

    Database servers are often the bottleneck. I think people are nuts for load balancing their web servers when it is the database servers that are most often bogged down. A 486 running PHP can saturate multiple T1s if the database back end is fast enough (I've done it).
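    Some back-of-the-envelope arithmetic shows why a modest box can fill a T1. This assumes the nominal T1 line rate of 1.544 Mbit/s, ignores framing and protocol overhead, and uses a made-up 10,000-byte average response; plug in your own numbers.

```python
T1_BITS_PER_SECOND = 1_544_000  # nominal T1 line rate


def requests_per_second_to_saturate(lines: int, avg_response_bytes: int) -> float:
    """Requests/second needed to fill `lines` T1s, ignoring protocol overhead."""
    return lines * T1_BITS_PER_SECOND / (avg_response_bytes * 8)


# Assumption: a 10,000-byte average response; measure your own site's figure.
for lines in (1, 2, 4):
    rate = requests_per_second_to_saturate(lines, 10_000)
    print(f"{lines} x T1: about {rate:.1f} requests/second")
```

    At that response size a single T1 is full at roughly 19 requests per second, so the web tier only has to sustain a few dozen pages a second before the pipe, or the database behind it, becomes the limit.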

  11. Layer 4 switching by gbr · · Score: 1

    I think that layer 4 switching is the way to go. Automatic load balancing between multiple servers. Server failover support (if a server goes down, it no longer gets requests).
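    The failover behaviour can be modelled in a few lines. This is a toy sketch, not how any particular switch works: the class, its method names, and the server names are invented, and a real device would mark servers down from its own health checks.

```python
from itertools import cycle


class FailoverBalancer:
    """Round-robin over backends, skipping any that are marked down."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._rotation = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._rotation)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


balancer = FailoverBalancer(["web1", "web2", "web3"])  # hypothetical names
before = [balancer.pick() for _ in range(3)]
balancer.mark_down("web2")
after = [balancer.pick() for _ in range(4)]
print(before, after)
```

    Once web2 is marked down it simply stops appearing in the rotation, and calling mark_up puts it back without restarting anything.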

    We will be evaluating some shortly... we'll see how it goes.