Real World Webserver Price vs. Performance Figures?
Borgoth asks: "At my company we just broke 10 million pageviews per day. We use 5 2-processor 1U off-the-shelf Intel boxes running Apache, Linux, mod_perl, and MySQL. This averages out to about 2 million pageviews per day per server (about 20 million hits/server, including images). Most of our pages have some dynamism using mod_include SSIs, and maybe one pageview in five directly results in a db query. We think we should be pretty happy that we're doing so much with so little, but we don't really have any idea how much horsepower other sites are using in their server farms. So, what sort of webfarms do Slashdot readers maintain, and how does their performance compare?"
To a purely static server, like thttpd. Then you can focus the dynamic servers on serving purely dynamic stuff, and optimize accordingly. Also, MySQL 4's query cache is a great thing, so if you're not using it yet, look into it.
My best advice is: use vmstat. vmstat 10 will give you readings on all the stuff you need to know: mem, disk and cpu usage.
If at first you don't succeed, skydiving is not for you
Too many people seem to concentrate on processing power and hardware while neglecting the software side of things.
Using a web server which pre-forks (example-- Apache 1.3x), is probably the best way to dramatically reduce performance and scalability in most situations. The sheer number of processes under high load makes most schedulers crap themselves in most situations.
Multithreadedness, an example is Apache 2.x, can greatly improve performance and scalability as can single process, single threaded multiplexing non-blocking IO based web servers such as Thttpd, BOA or Zeus.
Once one has selected a server which works effeciently for them given their content, fine tuned their OS, then one can move towards actual processing power and system throughput.
The expense of gzipping is easily recovered by the fact that it takes 2x or 3x less time to send the data out. That is the same precious time that httpd spends to send its output, while eating up memory and CPU.