Slashdot Mirror


Building a Scaleable Apache Site?

bobm writes "I'm looking for feedback on any experience building a scaleable site. This would be a database driven site, not just a bunch of static pages. I've been looking for pointers to what other people have learned (either the easy way or hard way). I would like to keep it Apache based and am looking for feedback on the max # of children processes that you've been able to run, etc. Hardware-wise, I'm looking at using quad Xeons or even Sun E10K systems. I would like to stay non-clustered if possible."

5 of 60 comments

  1. Re:Persistent Connections Are Your Friend - MAYBE by Longstaff · · Score: 4, Interesting

    Combined with a proper connection pool, they can really save your butt.

    However, persistent connections may be too much of a burden for an overworked db server. If you're using PHP/MySQL for example, mysql_pconnect may not be the way to go if you have a few front end servers hitting the database. It seems that the PHP connection pooling limit is per process. If you have 10 web servers, each running 100 Apache processes with a 10 connection limit per process, that's a max of 10,000 db connections!!!
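
    To make the arithmetic above concrete, here is a tiny sketch of the worst case. The function and parameter names are made up for illustration; the numbers are the ones from the example.

```python
def max_db_connections(web_servers: int, apache_children: int, pool_limit_per_child: int) -> int:
    """Worst-case simultaneous db connections when every Apache child keeps its own pool."""
    return web_servers * apache_children * pool_limit_per_child


# 10 web servers x 100 Apache processes x 10 connections per process
worst_case = max_db_connections(web_servers=10, apache_children=100, pool_limit_per_child=10)
print(worst_case)  # 10000
```

    The point is that the limit multiplies at every layer, so the database sees the product, not the per-process setting.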

    One idea might be an intermediate "connection broker" on a per server basis. We use something similar to this.
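
    A minimal sketch of the capping idea behind such a broker, not our actual setup: a fixed set of connections is opened once and leased out, so the database never sees more than the cap. The class name ConnectionBroker and the lease() method are hypothetical, sqlite3 only stands in for the real database, and this version shares connections between threads in one process. A real per-server broker for forked Apache children would have to be a separate daemon that the children talk to.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionBroker:
    """Hands out at most `max_connections` database connections at a time."""

    def __init__(self, connect, max_connections):
        # Open every connection up front; the queue size is the hard cap.
        self._pool = queue.Queue(maxsize=max_connections)
        for _ in range(max_connections):
            self._pool.put(connect())

    @contextmanager
    def lease(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._pool.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller failed.
            self._pool.put(conn)


# An in-memory sqlite3 database stands in for the real db server (assumption).
broker = ConnectionBroker(
    lambda: sqlite3.connect(":memory:", check_same_thread=False),
    max_connections=2,
)

with broker.lease() as conn:
    value = conn.execute("SELECT 1 + 1").fetchone()[0]
print(value)
```

    Callers that arrive when all connections are leased wait (or time out) instead of opening new ones, which is what keeps the db server's connection count bounded.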

    Apache's fork() model is great for stability, but it really hinders interprocess resource sharing. We're mostly Java based here, which allows us to use beans and such. Does mod_perl allow for resource sharing between processes?

  2. Session Management by JMandingo · · Score: 2, Interesting

    Assuming you are using multiple web servers, and that your app is complex enough to require a session data management scheme (rather than just passing vars from page to page in the query strings), I recommend using cookies for session data. Naturally this only applies IF you don't mind requiring your clients have cookies enabled, IF you don't need to store anything more complex than strings, and IF the total amount of data you need to store is small.
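
    A rough sketch of what cookie-held session data could look like, assuming small string values as described above. The signing step is my own addition, not something required by the approach: since the client holds the data, an HMAC lets any web server detect edits without a database lookup. SECRET_KEY, encode_session and decode_session are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret; every web server behind the load balancer would share it.
SECRET_KEY = b"replace-with-a-real-secret"


def encode_session(data: dict) -> str:
    """Serialize small string data into a signed cookie value."""
    payload = base64.urlsafe_b64encode(json.dumps(data, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_session(cookie: str):
    """Return the session dict, or None if the cookie was altered."""
    payload, _, signature = cookie.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; a mismatch means tampering or corruption.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_session({"user": "bobm", "report": "q3"})
print(decode_session(cookie))
```

    Because nothing is stored server-side, any web server can handle any request, but the whole value has to stay within the browser's per-cookie size limit (commonly around 4 KB).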

    Another option is to store session data in your top level frame on the client, but this can be messy and hard to debug. Storing session data in your database is elegant and easy to debug, but can increase the hits on your database to a prohibitive degree. Adding database bandwidth in the future is difficult and expensive. Adding web servers to your system is comparatively cheap and easy.

    --
    Vonnegut was right: Of all the words of mice and men, the saddest are, "It might have been."
  3. Re:Look in the right place by bobm · · Score: 4, Interesting
    Thanks for the info, I left the specifics out since I'm looking for generic feedback (for the learning).

    The site will be mostly serving dynamic content with the average page being about 60-120k of code and around 10k of images. And yes, that's a lot of code but the site is serving up reports and whatnots. There are small pages between reports and the usual login, etc screens.

    The real purpose of the question was to see what kinds of tuning are being used in the real world. As the web has matured, there has to be some interesting information on keeping systems up 24/7, etc.

    For example we're looking into a replicated database with just the important info (and I know that important is a real fuzzy term) for periods when we need to bring the primary database down.

    What would be interesting is the proactive analysis (when do you add more hardware, etc.) that is done on a live, running system.

    thanks

  4. Re:You need to provide way more info by bobm · · Score: 4, Interesting
    Database: Informix on EDS served from an E10K.


    Dynamic: currently mod_perl but open to something faster (if there is a proven faster technology).


    Apache: currently 1.3.x, moving to 2.0.x when it's ready for prime time.

    OS/Hardware: open, currently Solaris/Sun, open to quad Xeon/Linux if it has the performance.


    The reason for asking about a single vs multiple machines is that I wanted to get a handle on what one box could do as opposed to the gut reaction to just keep adding servers.


    Although I'm not expecting magic, I didn't want to get too specific because I'm interested in feedback from across the board. For example, how do Orbitz, Yahoo, or the *New York Times* maintain uptime? I haven't found anywhere that discusses places like that.

  5. Great article on Scaling your DB by RevDigger · · Score: 2, Interesting
    This is a great article on scaling a website really fast. I found their techniques for scaling their database especially interesting.
    http://www.webtechniques.com/archives/2001/05/hong/
    It's about the guys who built amihotornot.

    - H