What Are Typical Load Averages for Servers?

Posted by Cliff on Wednesday November 14, 2001 @12:37PM from the deciphering-the-numbers dept.

Jon Hill asks: "I'm curious to figure out how to gauge the performance of my servers and know at what level of usage I should think about hardware upgrades. 90% of our servers run Linux and various services standard to Linux such as sendmail, samba, DNS, etc. One of our main servers (router/firewall/sendmail/spop) has been running with a load average of .5 to 1.5 regularly. It supports 200 users and is an SMP Intel machine with 2GB of RAM. I'm not sure if it needs software/kernel tweaking or hardware modifications and I can't seem to find any reference information. Suggestions?"

It really depends. . . by foo+fighter · 2001-11-14 13:01 · Score: 3, Informative

What you are asking about is 'performance tuning'. Do a search at Google on that term and you will find plenty of information online. http://linuxperf.nl.linux.org/ might be a good place to start.

Average load is unique to a system. To figure out what that average is you need to monitor the server for a while.

I don't know about Linux, but I've done a lot of NT Server tuning and I think some of the general principles can be shared across platforms.

* Monitor CPU, Memory, Disk, and Network load over time (these are the four primary sources of bottlenecks in computer systems). Figure out what is regular for *your* systems. I take samples a few specific times a day every few days.

* If one metric is consistently high, at or near 100% utilization, that's a good sign of a possible bottleneck. Take care of that bottleneck by increasing processor speed, adding more memory, adjusting the settings/algorithms of your software, etc.

* Make one change at a time, and then measure the results.

* Document your changes, so that if you actually slow the machine down you can go back to the original state.

* When you remove a bottleneck, it is replaced by another. That's the name of the game.

* The best way to tell if you have a bottleneck is user input. Are they complaining that database lookups take too long? That web pages aren't delivered fast enough? Or are they quietly content (right, you wish! :-)?
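
The "monitor over time, then look for a metric that is consistently near 100%" advice above can be sketched in a few lines. This is only an illustration: the function name `find_bottlenecks`, the 90% threshold, and the 80% "consistently" fraction are assumptions chosen for the example, not values from the comment, and the sample numbers are made up.

```python
from statistics import mean


def find_bottlenecks(samples, threshold=90.0, fraction=0.8):
    """Return metrics that are at or above `threshold` percent utilization
    in at least `fraction` of their samples, mapped to their mean value.

    `samples` maps a metric name to a list of utilization percentages
    collected over time. Both cutoffs are hypothetical defaults.
    """
    flagged = {}
    for metric, values in samples.items():
        if not values:
            continue  # nothing measured yet for this metric
        high = sum(1 for v in values if v >= threshold)
        if high / len(values) >= fraction:
            flagged[metric] = mean(values)
    return flagged


# Made-up samples for the four usual suspects.
history = {
    "cpu": [95, 98, 100, 92, 97],
    "memory": [40, 45, 50, 42, 41],
    "disk": [95, 20, 30, 25, 10],
    "network": [5, 8, 12, 7, 6],
}
suspects = find_bottlenecks(history)
```

Here only "cpu" is flagged; the single disk spike does not count as consistently high. Rerunning the same check after each individual change is one way to follow the "make one change at a time, then measure" rule.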

Good Luck!

--
obviously no deficiencies vs. no obvious deficiencies

Sounds about right by KnightStalker · 2001-11-14 13:02 · Score: 5, Informative

The load average represents the average number of processes waiting in the run queue over a given period of time. The three numbers reported by "uptime" are the averages over the last one, five, and fifteen minutes. Yours sounds fairly low, because you have more than one processor (a load average of 2 or more would be 100% usage for a dual-processor machine, at least on Solaris).

The Solaris server where I work has 16 processors and the load average usually sits around 10-15. I'd be worried if my single-proc linux workstation had that high of an average, though... :-)
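
A rough way to compare machines with different processor counts is to divide the load average by the number of CPUs. The sketch below assumes that simple per-CPU normalization is meaningful for your workload; the helper name `load_per_cpu` is made up for illustration.

```python
import os


def load_per_cpu(load_averages, cpu_count):
    """Divide each load average by the number of processors.

    A result near 1.0 suggests the CPUs are roughly fully used;
    well below 1.0 suggests spare capacity. Rounded to two decimals.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(round(load / cpu_count, 2) for load in load_averages)


def current_load_per_cpu():
    """Normalize this machine's load averages (Unix-like systems only)."""
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages.
    return load_per_cpu(os.getloadavg(), os.cpu_count() or 1)


# The poster's dual-processor server at the top of its usual range:
dual = load_per_cpu((1.5, 1.5, 1.5), 2)
# A 16-processor server with a load average of 12:
big = load_per_cpu((12.0, 12.0, 12.0), 16)
```

Both examples come out at 0.75 per CPU, which is why a raw figure of 12 can be fine on one box and alarming on a single-processor workstation.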

--
* And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."

It all depends on what you find acceptable. by Above · 2001-11-14 13:18 · Score: 2, Informative

Load average is a measure of the number of things 'waiting' to run. Depending on your OS this may or may not include a number of interesting corner cases. In particular, this almost always includes things like disk i/o, and tty i/o. A user with a CPU bound process won't notice disk i/o issues, and vice versa.

So what is the range of acceptable? Well, for a single user workstation a load average of 1 (one thing waiting) probably means the user is waiting, and you may want more CPU or disk bandwidth. On the other hand, a highly multi-user machine (say a news server) may get optimal transfer rates out of the disk hardware by having a lot of things waiting so it can schedule reads and writes.

Look at all the resources on your machine, use tools like vmstat, iostat, netstat, etc. See why processes are waiting. Look at your user load and see if it's ok. For instance, with a 100Mbps ethernet, you could serve 10 users at 10Mbps each, or 100 at 1Mbps each. The latter will have a higher load average, but if 1Mbps per user is fine with you, then there is no problem.
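
The ethernet arithmetic above is simple enough to check in code. This sketch assumes an idealized even split of the link with no protocol overhead, and the function names are invented for the example.

```python
def per_user_mbps(link_mbps, users):
    """Bandwidth each user gets if the link is shared evenly (idealized)."""
    if users < 1:
        raise ValueError("users must be at least 1")
    return link_mbps / users


def is_acceptable(link_mbps, users, minimum_mbps):
    """True if the even share meets whatever minimum you consider fine."""
    return per_user_mbps(link_mbps, users) >= minimum_mbps


few = per_user_mbps(100, 10)    # 10 users on 100Mbps ethernet
many = per_user_mbps(100, 100)  # 100 users on the same link
ok = is_acceptable(100, 100, minimum_mbps=1.0)
```

The point is the same as in the comment: the 100-user case shows a busier machine, but whether that is a problem depends only on the minimum you decide is acceptable.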

To give some real-world examples: I've seen news and mail servers both run load averages well over 200 and still deliver acceptable performance. I've also seen shell servers with load averages as small as 5 that are very sluggish (often because they are swapping).

Re:Heh.. you call that a load average.. by AtariDatacenter · 2001-11-15 05:20 · Score: 3, Informative

Along those lines, at work, I had a 5-way box (5x250MHz, 5GB RAM) that supported 1,100+ simultaneous users telnetting into it, running an application. A typical user had four processes running. The application was interactive and the user would type a few things, hit against the back-end database [on another box], and go off and do more stuff.

The load average easily soared past 100 and up. It was becoming a nightmare. Without the new hardware ready, there wasn't much I could do.

But I found that if I adjusted the time slices to 1/10th their normal level, the system had much better response, and the load average sank down into the 10s.

My understanding of why this worked is that Solaris' process dispatcher works a little differently: it also reserves 'unused' time for the process that just got off of the CPU, just in case it wants right back on. The idea is to preserve L2 cache.

In this case, handling keystrokes back and forth, which needs very little CPU, ended up hogging a larger slice, and processing power was thrown away.

It was nice to see a change like that do wonders on the box.