Building a Better Webserver
msolnik writes: "The guys over at Aces' Hardware have put up a new article going over the basics, and not-so-basics, of building a new server. This is a very informative, I think everyone should devote 5 minutes and a can of Dr Pepper to this article."
Microsoft has written several white papers of this sort already. Of course, they're Microsoft, so that means I can kiss my +2 bonus goodbye. Seriously, though.
If you fall off a building, go real limp, because maybe you'll look like a dummy and people will be like hey, free dummy
Once upon a time, we had 1 web server that did everything, so it needed to be able to do everything. Now everytime we do something new we toss out a new webserver (or 2 or 10 of 'em). And they all basically need to do one thing (webmail, portal, whatnot) and do it well and that's it.
So we've got a whole bunch of Apache servers which a bucket load of apache processes who basically spend all day doing little more than exec'ing the same CGI over and over (and copying the data around a couple of extra times).
I'm pretty much now convinced that would my next step is going to be to franken-meld my cgi with something like mini-httpd so it is a single, persistant, app.
I'm certainly not redoing the whole thing in Java though! :)
Shut up, be happy. The conveniences you demanded are now mandatory. -- Jello Biafra
The SPARCstation 20 was one heck of a great machine back in the day, especially for its size (a low profile pizzabox). The design was a lot like it's older brother (the SPARCstation 10 from 1992)... that is, two MBUS slots (for up to 4 CPUs) and 4 SBUS slots (Sun Expansion cards, 25 MHz x 64 bit = ~ 200 MB/sec each, but 400 MB/sec bus total).
I remember using a Sun evaulation model at Rice many years ago... the machine had two 150 MHz HyperSPARC processors (though 4 were avilable for more $$), a wide SCSI + fast ethernet card, two gfx cards for two monitors, and some sort of strange high speed serial card (for some oddball scanner, I think). Not to mention 512 MB of ram, in 1994! The machine was a pretty decent powerhouse and sooo small! I sort of wish the concept would have caught on, given how large modern workstations are in comparison. Heck, back then an SBUS card was about 1/3 the size of a modern 7" PCI card.
Then there's the other end of the spectrum... one department bought a Silicon Graphics Indigo2 Extrme in 1993. The gfx cardset was three full size GIO-64 cards (64 bit @ 100 MHz = about 800 MB/sec), one of which had 8 dedicated ASICs for doing geometry alone. 384 MB of RAM on that beast. Pretty wild stuff for the desktop.
Ahh, technology. I love you!
There are more factors than just CPU and Bandwidth... like what's between the two. A new coworker recently told me of his major learning experiences in the mid 1990s running several popular news websites durring the beginning of the web boom. One of the more popular sites he ran originally had a T1 routed through a Cisco 4000 router. Things worked great until he had an additional, load balancing T1 added for added thruput and redundancy. Things didn't feel much faster, in fact, they were almost slower. After much investigation he learned that the router didn't have enough RAM or CPU to handle the packet shuffling that intelligent multihoming routing requires. A similar instance happined with a friend's company when they tried to run a T3 through their existing router. While the old cisco had enough cpu and ram in theory, its switching hardware and thruput couldn't handle the full number of packets the T3 was providing thru the shiny new HSSI high speed serial card.
Now, I realize modern hardware (Cisco 3660 and 7x00 series, and pretty much any Juniper) can route several T3s (at 45mbps each direction) worth of data, but older routers and minimally configed routers do exist.
There are MANY bottlenecks in hosting a website. Server daemon, CPU, router, routing and filtering methods, latency and hops between server and internet backbones, overall bandwidth thruput, and much more.
It's not as simple as "lame server, overloaded CPU, should have installed distro version x+1".
If you can't live without Apache, there's always mod_bandwidth.
Not quite as elegant a solution, but it's nice for preventing your web server from taking all of your bandwidth (if, say, you run it off your cable modem, and wish to continue gaming...).
quote:
::same:: address space, and only get new memory when they try to touch a page that is write only (ie they can run and run and run, but once they try to access their memory they get new memory space with the contents copied). It saves time and memory.
:(
This means an Apache web server using keepalives will need to have more child processes running than connections. Depending upon the configuration and the amount of traffic, this can result in a process pool that is significantly larger than the total number of concurrent connections. In fact, many large sites even go so far as to disable keepalives on Apache simply because all the blocked processes consume too much memory.
::end quote::
lets see, anyone here hear of COW (copy on write) Linux uses this idea to save time on fork'd child processes, they get the
The only setback is when a process fork's a child, its current time slice is cut in half with half given to the child, so the main proc will run aground if to many requests come in and the server has more processes to worry about.
-ShadoeLord
this is my sig, there are many like it, but this one is mine.