Scaling Server Performance
An anonymous reader writes "When Ace's Hardware's article Hitchhiker's Guide to the Mainframe was posted on Slashdot, they got 590,000 hits and over 250,000 page requests during one day. This kind of traffic caused only a 21% average CPU load to their Java-based web server, which is powered by a single 550MHz UltraSparc-II CPU. In their newest article, Scaling Server Performance, Ace's Hardware explains how this was possible."
Are we supposed to be impressed with a computer that can serve 8 hits and 4 pages per second?
they got 590,000 hits and over 250,000 page requests during one day. This kind of traffic caused only a 21% average CPU load ... they didn't respond to any of them.
Reliable, Great Value Hosting: $7.95/mo 2.4G/120G
Of course, it is incumbent upon all of us to rush out and try to the link to the article. And some of us to actually read it as opposed to just reading the title.
SLASHDOT THEM AGAIN!!!
When I was benchmarking web servers in *1994*, servers could handle 100,000/hr, which is only about 30/sec. You may need a T3 to handle the bandwidth, but any server that can't handle it today is misconfigured.
seeing as it took Slashdot 35 seconds to serve me up this comments.pl?op-Reply page, yes, i think we are supposed to be impressed.
Cretin - a powerful and flexible CD reencoder
Which is very funny: this is an article explaining how a web site survived the /. effect, thus trying to catch the /. readers back for a second round, and getting lots of advertising hits at the same time. If only that server could keep up. /. I saw a report about a 200Mhz (?) PC running Windows 95 and with about 30 hard disks, that also seemed to do very well under the /. effect.
Now, a while back on
Sig for sale or rent. One previous user. Inquire within.
I would be more interested in stats on a webserver that took a puke. It would be interesting to see what started the dominos falling and what ultimatly brought it down. It would be as good a learning experiance as this article is.
OTOH, my puny little SDSL connection was seriously maxed out.
Even old hardware can happily serve up hundreds of documents a second, if the pages are static.
>Garth Brooks covers this in his famous book "The Mythical Man Month" where he proves in a controled lab environment that Java under X86 runs on the order of Olog(n) slower than it does on a RISC chip like an UltraSparc.
WTF? Fred Brooks wrote this book, and I don't seem to remember RISC or UltraSparc chips, not to mention Java, in 1974. Garth Brooks is (AFAIK) a country music singer. Try again.
Sig for sale or rent. One previous user. Inquire within.
I never really thought that the problem lied with the server's hardware, but in the bandwidth to the host. Shouldn't an article be written about how to conserve bandwidth during a slashdot effect? Even older servers should be able to handle 100 requests per second. I think most FPS's are alot more taxing than that.
"The page cannot be displayed".
The lesson here is: put your money where your mouth is and you may end up eating it.
Lots of people could use this type performance. I only had a chance to use JSP on one project, a while back. Tomcat was notoriously difficult to install back then. But when it was up, the difference between JSP application server and PHP become apparent. Application servers can make quite the difference.
Just having an application scope for variables saved us a trip the the ldap server per request. PostNUKE, squirellmail, and lots of other large PHP apps could be sped up drastically if some of those features were available in the PHP engine.
Based on upvotes, Ageism is the only "-ism" Slashdotters care about and think isn't SJW
I prefer reading "The Mythical Man Month" in the original latin. Brooks' translation leaves much to be desired.
Reloading their page a couple times (2nd page of the article, not the one slashdot linked to), I'm getting occasional 503 errors, and the rest are taking a very long time to load. Usually the page comes up with some "broken" images that didn't load.
At the bottom of each page, there's a number that seems to indicate the time they believe their server spent serving the page. Usually is says something like "2 ms" or "3 ms"... That may be how long their code spent creating the html, but the real world performance I see (via a 1.5 Mbit/sec DSL line) is many seconds for the html and many more for the images, some of which never show up, and sometimes a 503 error instead of anything useful at all.
So, Brian, if you're reading this comment (which will probably be worthy of "redundant" moderation by the time I hit the Submit button)... it ain't workin' as well as you think. Maybe the next article will be an explaination of what went wrong this time, and you can try again???
PJRC: Electronic Projects, 8051 Microcontroller Tools
They've really simply discovered that dynamically generating essentially static content is a bad idea : the 'dynamic' pages they are talking of are just articles which once written stay the same, and so are serving identical pages to each user.
Using scripting with database look ups to create such pages is obviously not good - much better is to compile your data in to static pages and serve those. I have done this for my own website using XSLT to generate the html pages with consistant links and menu's etc. - but you do have to remember to re-build it after making any changes or adding new content (I use gnu make to handle the dependancies of one page upon another so it doesn't rebuild the entire site everytime.)
They've taken the alternative approach of still using a database for the requests, but then caching future requests for the same page-id's, which has the advantage of being compatible with their original dynamic generation system, but they don't mention how they handle the dependancy / cascading alterations problem if they change the content (though they could always flush the entire cache of course....).
Neither of these approaches can help you though if you have real dynamic pages where every request is unique or there are are too many possible pages for caching to be feasible (for example amazon or google).
I remember reading that original article, and yes, I was impressed at the responsiveness of the server. But before they are congratulated so much, consider this. The original story was posted on slashdot at 1AM.. so the initial spike of activity resulting from the linking being in the top few on Slashdot was directly proportional to the number of people on Slashdot at the time. As you can see from their graphs (if they're showing up for you) that traffic spiked, then continued on during the day.
This time around, the link got posted at 2PM not 1AM, and so far as I can see, they handle this flurry of hits much less gracefully than the previous ones! There are a lot more people online at 2PM than 1AM (all arguments of nocturnal nighthawks and people in other time zones aside).
Seriously ... the numbers aren't that great. I used to admin a DEC Alpha Digital Unix server running at a whopping 300Mhz and it routinely served over 1.5M hits per day along with email, authentication and accounting for over 5,000 people and we rarely if ever saw it over a 0.5 load average. This was 4 years ago.
It's not apples to apples, since we weren't serving the same set of pages (we had around 500 personal homepages, each with a varied combination of static HTML, images and CGI programs) but honestly, if the numbers in this article are supposed to be impressive, we've grown too accustomed to web server feature bloat.
It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.
Queuing approaches have proven to be much more scalable in other areas - no reason to think it wouldn't work for web servers. Check out SEDA: An Architecture for Highly Concurrent Server Applications for a working implementation in Java that outperformed Apache [insert benchmark caveat here].
More on event-driven servers that minimize data copies and context-switching here.
Some static story pages? Who cares?
It all depends if you are actually doing something of interest.
Like the comments in Slashcode, most apps go from static, to dynamic, to static caching of dynamic pages.
At DTN we served up customized portal pages to people with commodity and equity quotes, news, graphs, etc. Since they didn't have any money we had to use a load balanced Pentium Pro and a Pentium II. The app had no problem serving the load, and it was fast.
Now that I work for companies that have money, our apps run really slow. Developers get expensive machines and don't know how to optimize any more.
Ace's Hardware needs to research real servers before talking about their "scalable" servers. Their numbers are really saying that their box performs like a dog.
For those of you interested in this topic here is a few pointers and words of wisdom.
Server scalabilty and performance has three basic metrics, thruput (urls/sec), simultaneous connections, and performance while overloaded. Of course, you could add latensy but I'd argue that with the correct design latency is directly proportional to the real work you are doing, bad design insertes arbitrary waits.
I know of a HTTP Proxy by a large ISP that does user authentications & URL authorization (re: database), header manipulation, and on-the-fly text compression at 3000 urls/sec for 2000-4000 simultaneous connections and maintains that performance under load by sheding connections, all this on a dual 1GHz Intel PIII box running a Open Source OS that starts with "L". That is a maximum of 260 Million URL/day, three orders of magnitude greater performance than Ace's Hardware stats.
The simple answer to the question "How do I create a scalable fast network server?" is Event-driven GOOD & Threads BAD. Event driven network communication is two to three orders of magnitude better performing than thread/thread-pool based network communications. See Dan Kegel's C10K web page. That means you must use non-blocking IO to client sockets and databases. Once you accomplish that small feat, dynamic content just consumes CPU; with 2.8 Ghz Xeon processors you have plenty of cycles for parsing HTML markup or whatever. Threads cause cache thrashing, and context switching. While thread programmers don't see the cost in their code, just read the kernel code and you'll see how much work HAS TO BE DONE to switch threads. Event driven programming just takes some state lookups (array manipulation) and a callback (push some pointers onto the stack and jump to a function pointer).
Desgin is FAR MORE IMPORTANT than which runtime you use (execution tree, byte code, or straight assembly). I have done some very high load network programming with Perl using POE.
Python has Twisted Python
Java has the java.nio and the brilliant event/thread hybrid library SEDA by Matt Welsch.
I am also looking into the programming language Erlang which builds concurrancy and event driven programming into the language. Further, Erlang is used by some big telco manufacturers to great effect (high performance and claimed 99.9999999% nine-nines reliability on a big app).
-- I am not a fanatic, I am a true believer.
As a bunch of people have pointed out, it is unlikely that the /. effect is a matter of "crashing" servers. It is much more likely that most of the "slashdotted" sites on the front page on a given day involve a server which is doing just fine and a bandwidth pipe which is seriously about to burst.
You can saturate most any small-business-affordable pipe with a Pentium classic machine as a Web server. Or to put it another way, there's no point sticking a dual-P4-Xeon Web server with 4GB memory and a RAID-5 on a DSL line.
The computer I'm using right now (a PIII system) could run Apache very nicely in the background and would likely survive quite a hitrate without too much trouble. But if even just a few thousand people were to hit it all at once, there would be a traffic jam, some people wouldn't get served, and the ISP would probably close me down, because I'm only sitting on a 256k pipe.
STOP . AMERICA . NOW
They said that the peak load was 11 hits per second, with 4 pages being served. They also said that their CPU was 21% loaded to serve this much traffic.
/. readers to hit them all at once, or perhaps they need a bigger pipe connecting them to the Net.
This says nothing about what they can serve under ideal conditions; this is what they actually served up during an actual slashdotting. If you want to max out their server, you will need to get more
Read the article; on ApacheBench with one particular page they tested, the server tested out at five dozen pages served up per second.
I don't know about you, but I was somewhat impressed by all this. A $1000 Sun does seem to have been a wise choice for them.
steveha
lf(1): it's like ls(1) but sorts filenames by extension, tersely
I hate to do this, and get into some kind of "look at my l33t skills" type thing...but seriously, those numbers are just nothing to be impressed with. As several people have pointed out, usually the limitation on a well configured server is the bandwidth available. I have a buddy who runs a few adult sites, and I go ahead and keep his machines updated, optimized, etc, etc. On one web server alone, with simply rebuild Apache with a higher HSL and streamlining only essential services this *one* server is handling an average of 16,000,000 hits per day. (avg approx. 16,000,000 hits, 5,000,000 pageviews, 450,000 unique visitors per day). In fact, only last month did we set up a separate database server in anticipation of him getting even more traffic (I wanted to separate the web server from the db server esp. if we were gonna move to load balancing)...even still the cpu load was consistently low and the site was/is serving dynamically generated content (php) and is all driven by a mysql content management system. I have yet to even max out the usage of the server and do some ulimit type stuff or hard adjustments via kernel changes.... so what is the big deal about this article. I think it would be good to put up an article about how to optimize your web servers both in layout and actual configurations to allow for Slashdot levels of traffic. I doubt this will happen, just as the mirroring content on featured stories to help ease bandwidth or other similar suggestions. The saddest part is that once you spend the time to really optimize a machine or machines...it takes far less time to maintain them.