Slashdot Mirror


What Are Typical Load Averages for Servers?

Jon Hill asks: "I'm curious to figure out how to gauge the performance of my servers and know at what level of usage I should think about hardware upgrades. 90% of our servers run Linux and various services standard to Linux such as sendmail, samba, DNS, etc. One of our main servers (router/firewall/sendmail/spop) has been running with a load average of .5 to 1.5 regularly. It supports 200 users and is an SMP Intel machine with 2GB of RAM. I'm not sure if it needs software/kernel tweaking or hardware modifications and I can't seem to find any reference information. Suggestions?"

5 of 25 comments

  1. It really depends. . . by foo+fighter · · Score: 3, Informative

    What you are asking about is 'performance tuning'. Do a search at Google on that term and you will find plenty of information online. http://linuxperf.nl.linux.org/ might be a good place to start.

    Average load is unique to a system. To figure out what that average is you need to monitor the server for a while.

    I don't know about Linux, but I've done a lot of NT Server tuning and I think some of the general principles can be shared across platforms.

    * Monitor CPU, Memory, Disk, and Network load over time (these are the four primary sources of bottlenecks in computer systems). Figure out what is regular for *your* systems. I take samples a few specific times a day every few days.

    * If one metric is consistently high, at or near 100% utilization, that's a good sign of a possible bottleneck. Take care of that bottleneck by increasing processor speed, adding more memory, adjusting the settings/algorithms of your software, etc.

    * Make one change at a time, and then measure the results.

    * Document your changes, so that if you actually slow the machine down you can go back to the original state.

    * When you remove a bottleneck, it is replaced by another. That's the name of the game.

    * The best way to tell if you have a bottleneck is user input. Are they complaining that database lookups take too long? That web pages aren't delivered fast enough? Or are they quietly content (right, you wish! :-)?
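
The monitoring advice in the list above can be sketched in a few lines of Python. This is only an illustration: the sample values are made up, utilization is assumed to be recorded as a fraction between 0.0 and 1.0, and the 95% threshold and 80% "consistently" cutoff are arbitrary assumptions, not figures from the comment.

```python
from statistics import mean


def summarize(samples):
    """Return (mean, peak) for a list of utilization samples (0.0 to 1.0)."""
    return mean(samples), max(samples)


def looks_like_bottleneck(samples, threshold=0.95, min_fraction=0.8):
    """True if at least min_fraction of the samples are at or above threshold.

    Both cutoffs are illustrative assumptions; tune them for your own systems.
    """
    if not samples:
        return False
    high = sum(1 for s in samples if s >= threshold)
    return high / len(samples) >= min_fraction


# Hypothetical samples taken a few times a day over several days.
baseline = {
    "cpu": [0.35, 0.40, 0.55, 0.38, 0.42],
    "disk": [0.97, 0.99, 0.96, 1.00, 0.98],
}

for metric, samples in baseline.items():
    avg, peak = summarize(samples)
    flag = "possible bottleneck" if looks_like_bottleneck(samples) else "ok"
    print(f"{metric}: mean={avg:.2f} peak={peak:.2f} -> {flag}")
```

With these invented numbers only the disk metric is flagged, since every one of its samples sits near 100% while the CPU samples do not.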

    Good Luck!

    --
    obviously no deficiencies vs. no obvious deficiencies
  2. Sounds about right by KnightStalker · · Score: 5, Informative
    The load average represents the number of processes waiting in the run queue over the last x amount of time. I think the three numbers reported in "uptime" are the last minute, five minutes, and fifteen minutes, but I could be wrong. It sounds like yours is fairly low, because you have more than one processor (a load average of 2 or more would be 100% usage for a dual-proc machine, at least on Solaris).

    The Solaris server where I work has 16 processors and the load average usually sits around 10-15. I'd be worried if my single-proc linux workstation had that high of an average, though... :-)
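
To compare machines with different processor counts, it can help to divide the load average by the number of CPUs. The sketch below assumes the Linux /proc/loadavg text layout, whose first three fields are the 1-, 5- and 15-minute averages; the sample line and the helper names parse_loadavg and per_cpu are invented for illustration.

```python
def parse_loadavg(text):
    """Parse the first three fields of /proc/loadavg-style text into floats."""
    one, five, fifteen = (float(field) for field in text.split()[:3])
    return one, five, fifteen


def per_cpu(load, ncpus):
    """Load average divided by CPU count; about 1.0 means every CPU is busy."""
    return load / ncpus


# Made-up sample line for a dual-processor machine like the asker's.
sample = "0.50 1.20 1.50 1/234 5678"
ncpus = 2

for label, load in zip(("1 min", "5 min", "15 min"), parse_loadavg(sample)):
    print(f"{label}: load {load:.2f} -> {per_cpu(load, ncpus):.2f} per CPU")
```

On a live Unix-like system, os.getloadavg() returns the same three averages and os.cpu_count() gives the processor count, so the sample line could be replaced with real readings.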

    --
    * And remember, it's spelled N-e-t-s-c-a-p-e, but it's pronounced "Mozilla."
  3. Sounds too low by PD · · Score: 3, Interesting

    If you've got an SMP machine, and your averages are .5 to 1.5, then you've either got too big of a machine for the job, or you should put more stuff on it to utilize it better.

    A processor utilized 100% of the time will give you a load average of 1.0. If you've got two processors, you should aim for a load of 2.0 average.

    So, good news! You don't have to do any tweaking for performance, unless you have specific issues with the speed of the server. You can probably add more to the server without affecting other processes (unless you've got a lot of I/O going on). You only gave CPU stats, so I am assuming that's what you're concerned about.

  4. Re:Heh.. you call that a load average.. by AtariDatacenter · · Score: 3, Informative

    Along those lines, at work, I had a 5-way box (5x250MHz, 5GB RAM) that supported 1,100+ simultaneous users telnetting into it, running an application. A typical user had four processes running. The application was interactive and the user would type a few things, hit against the back-end database [on another box], and go off and do more stuff.

    The load average easily soared past 100 and up. It was becoming a nightmare. Without the new hardware ready, there wasn't much I could do.

    But I found that if I adjusted the time slices to 1/10th their normal level, the system had much better response, and the load average sank down into the 10s.

    My understanding of why this worked is that Solaris' process dispatcher worked a little differently, in that it also reserved 'unused' time for the process that just got off of the CPU, just in case it wanted right back on. The idea is to preserve L2 cache.

    In this case, handling keystrokes back and forth, which needs very little CPU, ended up hogging a larger slice, and processing power was thrown away.

    It was nice to see a change like that do wonders on the box.

  5. General recommendation... by larien · · Score: 3, Insightful
    As a general recommendation I heard once, your load average shouldn't get higher than 2 x the number of CPUs, i.e. on a single-CPU box it shouldn't get higher than 2, and on a 64-CPU high-powered server it shouldn't get above 128.

    I've found it a reasonably good guide to when there's an issue on Solaris boxes; I think Linux uses similar numbers to calculate run queue averages, but other OSes (e.g., IRIX) use different formulas to calculate it, so you might need to tweak this recommendation.
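
The 2 x CPUs rule of thumb is easy to turn into a quick check. The function name load_status and the intermediate "busy" tier (load above one per CPU) are assumptions added for illustration; only the factor of 2 comes from the recommendation above, and it may need adjusting on systems that compute load differently.

```python
def load_status(load, ncpus, factor=2.0):
    """Classify a load average against the 'factor x number of CPUs' guideline."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    if load > factor * ncpus:
        return "over"   # above the rule-of-thumb ceiling
    if load > ncpus:
        return "busy"   # more runnable work than CPUs, but under the ceiling
    return "ok"


# The asker's dual-processor server at its highest reported load.
print(load_status(1.5, ncpus=2))
# A single-CPU box and a 64-CPU server just past their ceilings.
print(load_status(2.5, ncpus=1))
print(load_status(129, ncpus=64))
```

Keeping the factor as a parameter makes it simple to loosen or tighten the ceiling after watching how a particular machine behaves.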