Scaling Server Performance

Posted by ryuzaki0 on Friday January 17, 2003 @06:40AM from the caching-for-fun-and-profit dept.

An anonymous reader writes "When Ace's Hardware's article Hitchhiker's Guide to the Mainframe was posted on Slashdot, they got 590,000 hits and over 250,000 page requests during one day. This kind of traffic caused only a 21% average CPU load on their Java-based web server, which is powered by a single 550MHz UltraSparc-II CPU. In their newest article, Scaling Server Performance, Ace's Hardware explains how this was possible."

15 of 341 comments

only 600, 000 per day? by vanyel · 2003-01-17 06:47 · Score: 4, Informative

When I was benchmarking web servers in *1994*, servers could handle 100,000/hr, which is only about 30/sec. You may need a T3 to handle the bandwidth, but any server that can't handle it today is misconfigured.
1. Re:only 600, 000 per day? by vanyel · 2003-01-17 07:13 · Score: 4, Informative
  
  Yes, but each one of those whiz-bang annoyances is just another hit to the server. Dynamic generation of pages is the real server killer, depending on how much hoop-de-loop you're going through to make them.
That's hardly impressive by shoppa · 2003-01-17 06:56 · Score: 5, Informative

When one of the sites that I serve, The Computer History Simulation Project, was slashdotted, I was serving 40-50 pages per second (which is nearly ten times the rate attributed to Ace's Hardware) on a 4-year-old webserver (a K6II-500) that cost about $200 to put together. And the server itself was ticking along with only a few percent CPU usage.
OTOH, my puny little SDSL connection was seriously maxed out.
Even old hardware can happily serve up hundreds of documents a second, if the pages are static.
That performance is supposed to be impressive? by backtick · 2003-01-17 07:18 · Score: 2, Informative

We have OLD Cobalt Raq3s (300 MHz AMD K6, 128 MB RAM, single IDE drive) running the latest Cobalt OS, and we JUST had one of these boxes get hammered this week; in a 12 hour period, it handled 625,000 hits (mostly CGIs, but it had a reasonable amount of static content), and at the same time handled 35,000 POP requests, sent 4,500 emails, and did some other random functions (and things like hostname lookups are enabled for weblogs, FTP uploads are happening for weather-site webcams that were associated with the heavy traffic, etc, so there's obviously not a huge amount of "tune it till it's ONLY gonna do one thing" going on here). Now, the box was taking a whipping compared to its normal load, but c'mon. I can't say the "Poor little 550 MHz UltraSPARC story" makes me tear up :-)
OLD ARTICLE by mgkimsal2 · 2003-01-17 07:18 · Score: 2, Informative

Tuesday, November 27, 2001 8:07 AM EST
------
It was published over a year ago, and undoubtedly was based on their spring/summer 2001 trials. Even then this info wasn't revolutionary, and is even less so now.

--
creation science book
different meanings of "dynamic" pages by Confuse+Ed · 2003-01-17 07:22 · Score: 4, Informative

They've really simply discovered that dynamically generating essentially static content is a bad idea : the 'dynamic' pages they are talking of are just articles which once written stay the same, and so are serving identical pages to each user.

Using scripting with database lookups to create such pages is obviously not good - much better is to compile your data into static pages and serve those. I have done this for my own website using XSLT to generate the html pages with consistent links, menus, etc. - but you do have to remember to rebuild it after making any changes or adding new content (I use gnu make to handle the dependencies of one page upon another so it doesn't rebuild the entire site every time.)

They've taken the alternative approach of still using a database for the requests, but then caching future requests for the same page ids, which has the advantage of being compatible with their original dynamic generation system, but they don't mention how they handle the dependency / cascading alterations problem if they change the content (though they could always flush the entire cache of course....).

Neither of these approaches can help you though if you have real dynamic pages where every request is unique or there are too many possible pages for caching to be feasible (for example amazon or google).
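
To make the caching approach concrete, here is a minimal sketch in Python, not Ace's Hardware's actual code (their site is JSP). The names PageCache and render_article are made up for illustration, and render_article stands in for whatever database lookup and templating a real site does. It shows both invalidation options mentioned above: dropping a single page id, or flushing everything.

```python
from typing import Callable, Dict


class PageCache:
    """Cache rendered pages by page id; re-render only after invalidation."""

    def __init__(self, render: Callable[[int], str]) -> None:
        self._render = render            # the expensive path (DB lookup + templating)
        self._pages: Dict[int, str] = {}
        self.renders = 0                 # how many times the expensive path ran

    def get(self, page_id: int) -> str:
        if page_id not in self._pages:
            self._pages[page_id] = self._render(page_id)
            self.renders += 1
        return self._pages[page_id]

    def invalidate(self, page_id: int) -> None:
        # Call this when one article is edited.
        self._pages.pop(page_id, None)

    def flush(self) -> None:
        # The blunt option: drop everything and let pages re-render on demand.
        self._pages.clear()


def render_article(page_id: int) -> str:
    # Hypothetical stand-in for the database query and template step.
    return f"<html><body>Article {page_id}</body></html>"


cache = PageCache(render_article)
cache.get(7)          # rendered
cache.get(7)          # served from the cache
cache.invalidate(7)   # article 7 was edited
cache.get(7)          # rendered again
```

This does nothing for the cascading problem: if page 7 embeds a menu that also appears on page 8, something still has to know to invalidate both, which is the job make does in the static-build approach.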
web serving has become bloated by Jahf · 2003-01-17 07:31 · Score: 3, Informative

Seriously ... the numbers aren't that great. I used to admin a DEC Alpha Digital Unix server running at a whopping 300 MHz and it routinely served over 1.5M hits per day along with email, authentication and accounting for over 5,000 people and we rarely if ever saw it over a 0.5 load average. This was 4 years ago.

It's not apples to apples, since we weren't serving the same set of pages (we had around 500 personal homepages, each with a varied combination of static HTML, images and CGI programs) but honestly, if the numbers in this article are supposed to be impressive, we've grown too accustomed to web server feature bloat.

--
It is more productive to voice thoughtful opinions (reply) than to judge (moderate) others.
Thread-per-request model is a bottleneck by mmcshane · 2003-01-17 07:32 · Score: 3, Informative

Queuing approaches have proven to be much more scalable in other areas - no reason to think it wouldn't work for web servers. Check out SEDA: An Architecture for Highly Concurrent Server Applications for a working implementation in Java that outperformed Apache [insert benchmark caveat here].

More on event-driven servers that minimize data copies and context-switching here.
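
For anyone who hasn't seen the queue-per-stage idea, a toy sketch in Python follows. It is only loosely in the spirit of a staged design and is not SEDA's API: the stage and run_pipeline helpers, the two stages, and the request strings are all invented for illustration. Each stage has its own inbox queue and a small fixed pool of workers, so the thread count stays constant no matter how many requests are waiting.

```python
import queue
import threading

STOP = object()  # sentinel that tells a stage's workers to shut down


def stage(handler, inbox, outbox, workers=2):
    """Start a fixed pool of workers that move items from inbox to outbox."""
    def loop():
        while True:
            item = inbox.get()
            if item is STOP:
                inbox.put(STOP)      # pass the sentinel on to sibling workers
                break
            outbox.put(handler(item))

    threads = [threading.Thread(target=loop) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(requests):
    parse_q, render_q, done_q = queue.Queue(), queue.Queue(), queue.Queue()
    # Stage 1 extracts the path; stage 2 builds a (fake) response line.
    parsers = stage(lambda line: line.split()[1], parse_q, render_q)
    renderers = stage(lambda path: f"200 {path}", render_q, done_q)

    for request in requests:
        parse_q.put(request)
    parse_q.put(STOP)
    for t in parsers:
        t.join()
    render_q.put(STOP)               # stage 1 is drained, so stage 2 can stop
    for t in renderers:
        t.join()

    results = []
    while not done_q.empty():
        results.append(done_q.get())
    return sorted(results)
```

The queues here are unbounded to keep the shutdown logic short. A real server would bound them and reject or shed work when a queue fills, which is where the overload behaviour comes from.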
Re:How they did it... by Anonymous Coward · 2003-01-17 08:09 · Score: 2, Informative

page request != hit
Re:Isnt the real problem BANDwidth? by pjrc · 2003-01-17 08:22 · Score: 3, Informative

One word: mod_gzip.
Yes, mod_gzip is great and I use it on my own server, but for any "normal" website the main advantage is an interactive speed-up for dialup users. It really doesn't save huge amounts of bandwidth (in this case, enough to matter for withstanding the slashdot effect).
As an example, the page slashdot linked to is 22443 bytes of compressible html, and approx 84287 bytes of images (not including the ads and two images that didn't load because they're not handling the slashdot effect as well as they think they can). At -9, the slowest and best compression (remember, this is a dynamic JSP site, not static content you can compress ahead of time), the html compresses to 5758 bytes, thereby reducing the total content from 106730 bytes to 90045.
That's only a 15.6% reduction in bandwidth.
Also, a typical HTTP response header, which can't be compressed, is about 300 bytes (not including TCP/IP packet overhead, which we'll ignore hoping that HTTP/1.1 keepalives are putting it all in one connection...). There were 18 images (actually 20, but junkbuster filtered 2 out for me). That's 19 HTTP headers, at 300 bytes each, all incompressible. Adding in HTTP overhead we're at (approx) 112430 without compression and 95745 with mod_gzip. So the incompressibility of the headers reduces the bandwidth savings to 14.8%.
The big advantage that makes mod_gzip really worthwhile for a site like that is that a dialup user can get all the html in about 2 seconds, rather than 5-6 (assuming the modem's compression is on). Then they can start reading, while the remaining 82k of images slowly appear over the next 20-30 seconds.
Now in some cases, like slashdot's comments pages, mod_gzip makes a massive difference. But for most sites, the majority of the bandwidth is images that are already compressed. That 10% to 20% reduction in bandwidth from simply installing mod_gzip is pretty small compared to a bit of effort redesigning pages to trim the fatty images.
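
If you want to run the same numbers for your own pages, the arithmetic above is easy to script. A minimal sketch in Python, using only the byte counts quoted in this comment (the constant and function names are mine):

```python
HTML_BYTES = 22443      # uncompressed html
HTML_GZIPPED = 5758     # same html at gzip -9
IMAGE_BYTES = 84287     # images, already compressed
RESPONSES = 19          # 1 html page + 18 images
HEADER_BYTES = 300      # rough size of one HTTP response header


def savings(before: int, after: int) -> float:
    """Percentage reduction going from before to after."""
    return 100.0 * (before - after) / before


body_before = HTML_BYTES + IMAGE_BYTES
body_after = HTML_GZIPPED + IMAGE_BYTES
overhead = RESPONSES * HEADER_BYTES

print(f"bodies only:  {savings(body_before, body_after):.1f}%")
print(f"with headers: {savings(body_before + overhead, body_after + overhead):.1f}%")
```

This prints 15.6% and 14.8%, matching the figures above. Swap in your own html and image sizes to see whether compression or trimming images is the bigger win for your pages.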

--
PJRC: Electronic Projects, 8051 Microcontroller Tools
Re:The problem is dynamic content by pjrc · 2003-01-17 08:51 · Score: 2, Informative

Pretty much any server can serve hundreds, or even thousands of pages per second (I benchmark a basic PC IIS 5 server serving 17,000+ pages per second),
Have you ever tried a test where the clients kept their connections open for a reasonable length of time??
In the real world, virtually all clients are connected via links ranging from slow dialup to 1.5 Mbit/sec. They hold connections open and tie up server memory resources for a lot longer than a fast-as-possible benchmark running on the same machine or over fast ethernet.
Any server running on a single box is probably going to have trouble with 17000+ pages per second to modem users, who require many seconds to transfer the page. If the average connection open time is 2 seconds, that's 34000 open connections. Even if the server used only 32k of RAM per connection (barely enough to buffer a few packets and allocate "window" inside the TCP layer in the OS, and maintain OS-level info and buffering for the open file), that'd be over 1 gigabyte of memory. I suspect a combination of Windows (TCP/IP & file I/O), IIS, and ASP.NET uses a lot more than 32k per connection.
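
The estimate above is just Little's law (connections in flight = arrival rate x time each one stays open) multiplied by a per-connection memory cost. A small sketch in Python so you can plug in your own guesses; the 2 seconds and 32k are the assumptions from this comment, not measurements, and the function names are mine:

```python
def open_connections(requests_per_sec: float, seconds_open: float) -> float:
    # Little's law: average number in the system = arrival rate * time in system.
    return requests_per_sec * seconds_open


def memory_bytes(connections: float, bytes_per_connection: int) -> float:
    return connections * bytes_per_connection


conns = open_connections(17000, 2.0)
mem = memory_bytes(conns, 32 * 1024)

print(f"{conns:.0f} open connections")
print(f"{mem / 2**30:.2f} GiB of per-connection state")
```

With modem users holding connections for 10 seconds or more instead of 2, the same request rate needs five times as much memory.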

--
PJRC: Electronic Projects, 8051 Microcontroller Tools
Ace HW needs a clue by LunaticLeo · 2003-01-17 09:04 · Score: 3, Informative

Ace's Hardware needs to research real servers before talking about their "scalable" servers. Their numbers are really saying that their box performs like a dog.

For those of you interested in this topic, here are a few pointers and words of wisdom.

Server scalability and performance has three basic metrics: throughput (urls/sec), simultaneous connections, and performance while overloaded. Of course, you could add latency, but I'd argue that with the correct design latency is directly proportional to the real work you are doing; bad design inserts arbitrary waits.

I know of an HTTP proxy by a large ISP that does user authentication & URL authorization (re: database), header manipulation, and on-the-fly text compression at 3000 urls/sec for 2000-4000 simultaneous connections and maintains that performance under load by shedding connections, all this on a dual 1GHz Intel PIII box running an Open Source OS that starts with "L". That is a maximum of 260 Million URL/day, three orders of magnitude greater performance than Ace's Hardware stats.

The simple answer to the question "How do I create a scalable fast network server?" is Event-driven GOOD & Threads BAD. Event driven network communication is two to three orders of magnitude better performing than thread/thread-pool based network communications. See Dan Kegel's C10K web page. That means you must use non-blocking IO to client sockets and databases. Once you accomplish that small feat, dynamic content just consumes CPU; with 2.8 GHz Xeon processors you have plenty of cycles for parsing HTML markup or whatever. Threads cause cache thrashing and context switching. While thread programmers don't see the cost in their code, just read the kernel code and you'll see how much work HAS TO BE DONE to switch threads. Event driven programming just takes some state lookups (array manipulation) and a callback (push some pointers onto the stack and jump to a function pointer).
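
For a feel of what "state lookup plus callback" means in practice, here is a minimal single-threaded echo server sketched with Python's standard selectors module. It is an illustration of the event-driven style, not the ISP proxy mentioned above, and it says nothing about the performance claims. The function names (accept, echo, serve, poll, run_forever) are mine.

```python
import selectors
import socket

sel = selectors.DefaultSelector()   # epoll, kqueue or select, whichever the OS offers


def accept(listener: socket.socket) -> None:
    conn, _addr = listener.accept()
    conn.setblocking(False)
    # Register the callback to run whenever this client has data to read.
    sel.register(conn, selectors.EVENT_READ, echo)


def echo(conn: socket.socket) -> None:
    data = conn.recv(4096)
    if data:
        # Simplification: assumes the kernel send buffer has room. A real
        # server would keep unsent bytes and wait for EVENT_WRITE.
        conn.send(data)
    else:
        sel.unregister(conn)
        conn.close()


def serve(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    listener = socket.socket()
    listener.bind((host, port))     # port 0 lets the OS pick a free port
    listener.listen()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, accept)
    return listener


def poll(timeout: float = 1.0) -> None:
    # One pass of the event loop: look up each ready socket's callback, call it.
    for key, _mask in sel.select(timeout):
        key.data(key.fileobj)


def run_forever() -> None:
    while True:
        poll()
```

Call serve() and then run_forever() to try it. Every client is handled by the same thread; the only per-connection state is the selector's registration entry, which is the whole point.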

Design is FAR MORE IMPORTANT than which runtime you use (execution tree, byte code, or straight assembly). I have done some very high load network programming with Perl using POE.

Python has Twisted Python

Java has java.nio and the brilliant event/thread hybrid library SEDA by Matt Welsh.

I am also looking into the programming language Erlang, which builds concurrency and event driven programming into the language. Further, Erlang is used by some big telco manufacturers to great effect (high performance and a claimed 99.9999999% (nine nines) reliability on a big app).

--
-- I am not a fanatic, I am a true believer.
CPU bound==something very very wrong by rufusdufus · 2003-01-17 10:55 · Score: 2, Informative

This story is dopey. If you have a web server and it is hitting a CPU bottleneck, you have done something wrong.

Ok, if the server actively plays chess against a hundred people, I'll let you be cpu bound.
How many per second? by steveha · 2003-01-17 11:04 · Score: 3, Informative

They said that the peak load was 11 hits per second, with 4 pages being served. They also said that their CPU was 21% loaded to serve this much traffic.

This says nothing about what they can serve under ideal conditions; this is what they actually served up during an actual slashdotting. If you want to max out their server, you will need to get more /. readers to hit them all at once, or perhaps they need a bigger pipe connecting them to the Net.

Read the article; on ApacheBench with one particular page they tested, the server tested out at five dozen pages served up per second.

I don't know about you, but I was somewhat impressed by all this. A $1000 Sun does seem to have been a wise choice for them.

steveha

--
lf(1): it's like ls(1) but sorts filenames by extension, tersely
Content expiration by ttfkam · 2003-01-17 12:26 · Score: 2, Informative

This is why people should set an expiration time on their static content. If, for example, I set up the images to expire one hour from the access time, a visitor coming back to the page (or hitting images shared between multiple pages) would request each image only once in that hour. An ISP's proxy servers down the chain would only help in this regard.

In addition, for static content, "Last-Modified" is easy to compute. Clients can request a page and send an "If-Modified-Since" header with the timestamp of their cached copy; if the item hasn't changed, the server returns a 304 response and no data.

The same can be done for dynamic content, but it requires a bit more work. Most web servers do these things for static content out of the box.
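
For dynamic content, the "bit more work" is mostly knowing when the underlying data last changed. A minimal sketch in Python of the server-side decision, assuming you can get that timestamp from somewhere (a database column, say). The respond function and its arguments are hypothetical; the header names, the 304 status, and the one-hour max-age (mirroring the one-hour example above) are standard HTTP.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond(
    last_modified: datetime,
    body: bytes,
    if_modified_since: Optional[str],
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honouring If-Modified-Since."""
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": "max-age=3600",   # let clients and proxies reuse it for an hour
    }
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
            # HTTP dates have one-second resolution, so drop microseconds.
            if last_modified.replace(microsecond=0) <= since:
                return 304, headers, b""
        except (TypeError, ValueError):
            pass   # unparseable date: ignore the header and send the page
    return 200, headers, body


mtime = datetime(2003, 1, 17, 6, 40, tzinfo=timezone.utc)
status, headers, _ = respond(mtime, b"<html>...</html>", None)
status_again, _, body_again = respond(
    mtime, b"<html>...</html>", headers["Last-Modified"]
)
```

The first call returns 200 with the page; the second, replaying the Last-Modified value the way a browser would, returns 304 with an empty body. last_modified must be a timezone-aware UTC datetime for format_datetime to accept usegmt=True.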

As was said in the article, the fastest request is the request that never has to be made.

--

- I don't need to go outside, my CRT tan'll do me just fine.