Slashdot Mirror


Real World Webserver Price vs. Performance Figures?

Borgoth asks: "At my company we just broke 10 million pageviews per day. We use five dual-processor 1U off-the-shelf Intel boxes running Apache, Linux, mod_perl, and MySQL. This averages out to about 2 million pageviews per day per server (about 20 million hits per server, including images). Most of our pages have some dynamism using mod_include SSIs, and maybe one pageview in five directly results in a db query. We think we should be pretty happy that we're doing so much with so little, but we don't really have any idea how much horsepower other sites are using in their server farms. So, what sort of webfarms do Slashdot readers maintain, and how does their performance compare?"
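To put the daily totals in the question into per-second terms, here is a rough sketch of the arithmetic. It assumes the 20 million hits figure is per server per day, that load is spread evenly over 24 hours (which real traffic never is), and that each database-backed pageview issues a single query; none of these are stated in the question.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second(daily_total: float) -> float:
    """Average rate per second for a daily total, assuming perfectly even load."""
    return daily_total / SECONDS_PER_DAY


# Figures quoted in the question.
servers = 5
pageviews_per_day = 10_000_000
hits_per_server_per_day = 20_000_000  # assumed to be a daily figure

pageviews_per_server = pageviews_per_day / servers
avg_pageviews_per_sec = per_second(pageviews_per_server)
avg_hits_per_sec = per_second(hits_per_server_per_day)

# "Maybe one pageview in five" results in a db query; one query each is assumed.
avg_db_queries_per_sec = avg_pageviews_per_sec / 5

print(f"pageviews/server/day: {pageviews_per_server:,.0f}")
print(f"avg pageviews/sec/server: {avg_pageviews_per_sec:.1f}")
print(f"avg hits/sec/server: {avg_hits_per_sec:.1f}")
print(f"avg db queries/sec/server: {avg_db_queries_per_sec:.2f}")
```

These are averages only; they say nothing about what the servers see at the busiest moment of the day.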

6 of 56 comments

  1. not many comparisons by rumpledstiltskin · · Score: 2, Insightful

    You probably won't find a whole lot of comparable situations, even on slashdot, except maybe slashdot itself.

  2. Hard to say by linuxwrangler · · Score: 4, Insightful

    It's a bit like saying "we just shipped 5000 thingies last month using 3 vehicles". Um, 5000 beanie babies or 5000 tractor engines? Was the vehicle a rowboat or a train?

    Every site is different. I don't really care that the servers are 1U; I'd rather know things like how large the database is and whether it sees mostly cached reads or read-write activity. How big is the pipe? What is the CPU speed and RAM size? What is the speed and type of disk? How many bytes are transferred?

    Incidentally, a much more important number is peak capacity, i.e., what is your 5-minute peak load? Whatever you can reasonably handle for 5-10 minutes you can probably handle constantly, but a supposedly high-volume site can melt down when it gets flashed up on the morning news or Slashdot.
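One way to find that 5-minute peak is to slide a window over request timestamps taken from an access log. The sketch below assumes the timestamps have already been parsed into seconds; the function name and the sample traffic are made up for illustration.

```python
from typing import Iterable


def peak_window_count(timestamps: Iterable[float], window_seconds: float = 300.0) -> int:
    """Largest number of requests that fall inside any window of the given length."""
    times = sorted(timestamps)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Drop requests that are a full window or more older than the current one.
        while t - times[start] >= window_seconds:
            start += 1
        best = max(best, end - start + 1)
    return best


# Hypothetical hour of traffic: one request every 10 seconds,
# plus a burst of 500 requests packed into under a minute.
steady = [i * 10 for i in range(360)]
burst = [1800 + i * 0.1 for i in range(500)]
requests = steady + burst

peak = peak_window_count(requests)
average = len(requests) / (3600 / 300)  # mean requests per 5-minute slice

print(f"average per 5 minutes: {average:.1f}")
print(f"peak in any 5 minutes: {peak}")
```

The gap between the two printed numbers is the point: sizing for the average leaves nothing for the burst.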

    --

    ~~~~~~~
    "You are not remembered for doing what is expected of you." - Atul Chitnis
  3. Sorry, no FAQ for that! by jgardn · · Score: 4, Insightful

    You are approaching the point where the information you'll get from others won't apply to you, because you are pioneering new territory with your company's technology.

    You have a website that has its needs. I can't imagine what kind of application you are using, how much memory it needs, whether it is processor intensive or disk intensive, or both. Depending on how your website works, there are a variety of solutions available. One solution to one problem might actually cause more problems for you if applied inappropriately.

    It might make a lot of sense to consolidate the database onto an advanced server -- with 2 procs, RAID SCSI drives, and a fair amount of memory. It might make a lot of sense to get cheaper boxes with more memory and only one processor to run the web servers. Perhaps you can mount them all off of one giant NFS file server, and have the data that the web servers need held in a cache on the web server. It might make a lot of sense to go talk to IBM and Sun and see what they have to offer as well. It might also make a lot of sense to redesign the way your web application works to reduce the load.

    But no one can tell you the right way to do it, because your situation is unique. No one can even give you a good estimate of cost. Your best bet if you are truly lost is to hire someone to analyze your code, your servers, and your needs, and come up with a plan. Those guys cost a bit of money, and finding a good one is near impossible. You're better off studying up on what your website really needs and experimenting with possible solutions.

    This is where you start to realize why web people can earn up to 6 digits. We don't just design web sites or program applications. We have to make sure they scale as well.

    --
    The radical sect of Islam would either see you dead or "reverted" to Islam.
  4. Re:Using mod_gzip? by TwistedKestrel · · Score: 2, Insightful

    Great advice. Nowhere in his post does he mention bandwidth concerns. So if he were to install mod_gzip, he would reduce the capacity of his servers, not increase it. Mod_gzip isn't the answer to everything.
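The trade-off here is CPU time spent compressing versus bytes saved on the wire. As a rough illustration only, the sketch below uses Python's zlib rather than mod_gzip itself, and the sample page is invented; actual numbers depend entirely on the content and hardware.

```python
import time
import zlib


def compression_tradeoff(payload: bytes, level: int = 6) -> dict:
    """Measure bytes saved and CPU time spent compressing one response body."""
    if not payload:
        raise ValueError("payload must not be empty")
    start = time.process_time()
    compressed = zlib.compress(payload, level)
    cpu_seconds = time.process_time() - start
    return {
        "original_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "ratio": len(compressed) / len(payload),
        "cpu_seconds": cpu_seconds,
    }


# Hypothetical, highly repetitive HTML page; real pages compress less well.
page = b"<tr><td class='comment'>Some repetitive HTML markup</td></tr>\n" * 500
result = compression_tradeoff(page)

print(f"{result['original_bytes']} -> {result['compressed_bytes']} bytes "
      f"(ratio {result['ratio']:.3f}), {result['cpu_seconds']:.6f} CPU seconds")
```

If the pipe is not the bottleneck, the bytes saved buy nothing and the CPU seconds come straight out of the servers' capacity.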

  5. DB or not DB? by fm6 · · Score: 3, Insightful

    Actually, a lot of Slashdot content is, for all practical purposes, static html. Notice the message that appears when you post?

    You neglected to mention what DBMS you use. Or is it a given nowadays that everybody uses MySQL?

    Which is my cue for my usual anti-MySQL flame. Except that it's old, I'm tired of doing it, you've all heard it. Still, I'd like to see some serious benchmarks comparing MySQL with PostgreSQL, Firebird, and Berkeley DB. With attention to realistic web-style queries, scalability and (except for Berkeley DB, of course) complex queries.

  6. Consider the whole picture.. by elemur · · Score: 3, Insightful

    Think about your network, load balancing, and other sorts of issues.

    For example, I had a site that I ran for a while that was fairly poorly built from an application perspective. However, the client had prepped a flash load (i.e., a bursty, concentrated load) for a specific time period, and I had about a month to prepare. The problem was that we couldn't rewrite the apps part of the site to ease the congestion, nor could we rewrite some apps to be distributed to multiple servers. (They stored state on the server.)

    So, I brought in a Foundry ServerIron, and used the URL switching to map all static files/items to a pair of Ultra 5 workstations. These had a bunch of memory and had iPlanet Enterprise Server configured with very aggressive caching parameters. For the dynamic content, I also increased any caching parameters available.

    (This is high level, but you get the idea. Basically, serve as much out of memory as possible.. other tuning issues.. turn off name resolution obviously.. make sure you aren't I/O bound.. or network bound for that matter.)
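The URL-switching idea can be sketched in a few lines: look at the request path, send anything that looks static to one pool of machines, and everything else to the application server. This is not how a ServerIron is configured; the host names, the extension list, and the single app server are assumptions made for illustration.

```python
import posixpath
from itertools import cycle
from urllib.parse import urlsplit

# Hypothetical rule: these extensions count as static content.
STATIC_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js"}

# Hypothetical host names. Two static boxes are rotated round-robin;
# the stateful application stays on a single server.
STATIC_POOL = cycle(["static1.example.internal", "static2.example.internal"])
DYNAMIC_POOL = cycle(["app1.example.internal"])


def choose_backend(url: str) -> str:
    """Pick a backend host based on the file extension in the URL path."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    pool = STATIC_POOL if extension in STATIC_EXTENSIONS else DYNAMIC_POOL
    return next(pool)


for url in ["/images/logo.gif", "/style/site.css", "/cgi-bin/order?id=1"]:
    print(url, "->", choose_backend(url))
```

Keeping the rule on the path alone means the switch never has to understand the application, which matters when the application cannot be changed.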

    The day came around and we served 5 or 6 million hits in two hours or so.. the average load on the servers was around 0.1. In fact, even on the servers with the static content getting lots of hits, there was only really disk activity when access logs were flushed to disk (Every 30 seconds)..

    So, don't just think about servers.. consider all options when trying to balance and handle your load.