Slashdot Mirror


Server Monitoring With Munin And Monit

hausmasta writes "In this article I will describe how to monitor your server with munin and monit. munin produces nifty little graphics about nearly every aspect of your server (load average, memory usage, CPU usage, MySQL throughput, eth0 traffic, etc.) without much configuration, whereas monit checks the availability of services like Apache, MySQL, and Postfix and takes the appropriate action, such as a restart, if it finds a service is not behaving as expected. The combination of the two gives you full monitoring: graphics that let you recognize current or upcoming problems (like "We need a bigger server soon, our load average is increasing rapidly."), and a watchdog that ensures the availability of the monitored services."

9 of 124 comments

  1. But can I run this on Windows? by Steve_Jobs_HNIC · · Score: 5, Funny

    .... been waiting a while to say that.

  2. Cacti by mtenhagen · · Score: 4, Insightful

    How is this different from cacti?

    --
    200GB/2TB $7.95 Coupon: SAVE90DOLLAR
    1. Re:Cacti by isolationism · · Score: 3, Informative

      Munin isn't at all different from Cacti, really, except that Cacti is 100% web based and perhaps a bit more mature (I use Cacti and like it a lot more than at least 4-5 other similar products out there). Cacti won't do service-testing though; maybe this is a good walkthrough for people who just want something up and running in 15 minutes (I wouldn't know, I'm not inclined to read the whole thing since a cursory glance shows there's nothing here that I don't have a running alternative for already).

  3. Automatic restarts are bad by Erik+Hensema · · Score: 5, Insightful
    • A restart usually kills hanging processes, making the actual cause of the hang impossible to determine afterwards.
    • Automatic restarts make some admins lazy. Instead of debugging the problem, they accept that Apache (or whatever service) gets restarted once a day.

    However, making graphs and monitoring your services is a very good thing. Graphs are invaluable in determining trends, such as memory leaks or steadily increasing load. Monitoring saves lots of downtime and unhappy customers ;-)
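    A minimal sketch of the trend idea, assuming evenly spaced samples and a plain least-squares line. The function names and the memory readings are hypothetical; they do not come from munin or any other tool mentioned here.

```python
from typing import Optional, Sequence


def trend_slope(samples: Sequence[float]) -> float:
    """Least-squares slope of evenly spaced samples (units per interval)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def intervals_until(samples: Sequence[float], limit: float) -> Optional[float]:
    """Extrapolate linearly from the last sample to `limit`.

    Returns None when the trend is flat or falling, because the limit
    is then never reached under this simple model.
    """
    slope = trend_slope(samples)
    if slope <= 0:
        return None
    return max(0.0, (limit - samples[-1]) / slope)


if __name__ == "__main__":
    # Hypothetical resident memory in MB, one reading per day.
    memory_mb = [512, 530, 547, 566, 583, 601]
    print("growth per day (MB):", trend_slope(memory_mb))
    print("days until 1024 MB:", intervals_until(memory_mb, limit=1024))
```

    A steadily positive slope on memory is the kind of leak signature a graph makes obvious at a glance; the extrapolation is only as good as the assumption that growth stays linear.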

    Personally I use Nagios for monitoring and DIY scripts for graphing. The latter mostly because I started making graphs before decent off-the-shelf software was available ;-)

    PS. what's this subject got to do with debian?

    --

    This is your sig. There are thousands more, but this one is yours.

    1. Re:Automatic restarts are bad by Jeff+DeMaagd · · Score: 3, Insightful

      Point taken, but I think an automatic restart is necessary to minimize off-hours intrusions for maintenance and such. If the service hangs and there's no one there to tend to it, it will stay hung until someone notices. That is not good if you want to keep going and not lose potential business while the site is down.
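      One way to reconcile the two positions, sketched under assumptions: restart automatically, but save diagnostics first and hand over to a human after a few attempts. The `Watchdog` class and its callables are hypothetical and are not monit's interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Watchdog:
    check: Callable[[], bool]      # returns True when the service is healthy
    restart: Callable[[], None]    # restarts the service
    snapshot: Callable[[], str]    # collects diagnostics before a restart
    max_restarts: int = 3          # consecutive restarts before escalating
    restarts: int = 0
    evidence: List[str] = field(default_factory=list)

    def poll(self) -> str:
        """Run one monitoring cycle and report what was done."""
        if self.check():
            self.restarts = 0      # a healthy cycle resets the counter
            return "ok"
        if self.restarts >= self.max_restarts:
            return "escalate"      # stop restarting; a person should look
        # Keep the evidence that a plain restart would destroy.
        self.evidence.append(self.snapshot())
        self.restart()
        self.restarts += 1
        return "restarted"


if __name__ == "__main__":
    # Stand-in service that never recovers, to show the escalation path.
    dog = Watchdog(
        check=lambda: False,
        restart=lambda: None,
        snapshot=lambda: "process list and open sockets would go here",
    )
    print([dog.poll() for _ in range(5)])
```

      In a real setup the `snapshot` callable would capture whatever helps later debugging (process list, stack traces, log tail) so the cause of the hang is not lost to the restart.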

      Anyway, I'm glad I'm not a server admin. I'd like to live my private life NOT being on-call.

  4. Seems a lot less clunky than Nagios or Cacti by Burv · · Score: 3, Informative
    I've tried both Nagios and Cacti for years. They work great, are very feature rich, and seem to have a strong community.

    The one thing that annoys me about them is that, out of the box, they don't have much configured, and to install/configure stuff, you have to jump through a lot of hoops.

    In the case of Cacti, it's mostly through a web-based GUI, which is OK if you have one server with one thing you want to measure, say %CPU usage, but if you want to do it for a server farm or even a couple of machines, it's a pain in the butt. They do have a templating system, but you still have to do a lot through the GUI. I've posted on their forums before to this effect, and they have suggestions for making changes like this en masse, but again, it doesn't work out of the box. Bottom line, the designers of Cacti seem to be focused on the web GUI, which is kinda nice for newbies, but a huge pain for people like me who like to script things.

    It's the same thing with Nagios, although at least they let you change text files for the settings. Although the number (about 20) of files is reflective of how feature rich it is, it also makes it a hassle to set up. Here's an article at samag.com that illustrates the process you need to go through... imagine this for a couple hundred servers, and you can see how arduous setting up nagios could be.
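    Since the settings are text files, the repetitive part can be scripted. A rough sketch, assuming a template named `generic-host` is already defined in your Nagios configuration (an assumption about your setup) and using made-up host names and documentation-range addresses:

```python
from typing import Iterable, Tuple

# One Nagios host object; doubled braces are literal braces for str.format.
HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    alias      {name}
    address    {address}
}}
"""


def render_hosts(
    hosts: Iterable[Tuple[str, str]], template: str = "generic-host"
) -> str:
    """Render one `define host` block per (name, address) pair."""
    return "\n".join(
        HOST_TEMPLATE.format(template=template, name=name, address=address)
        for name, address in hosts
    )


if __name__ == "__main__":
    # Hypothetical inventory; in practice this might come from a CSV or CMDB.
    inventory = [(f"web{i:02d}", f"192.0.2.{i}") for i in range(1, 4)]
    print(render_hosts(inventory))
```

    The output would be written to a file that the main configuration includes; services, contacts, and host groups would need similar treatment, which is where most of the setup effort goes.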

    So, although munin may not be as mature and well known as Cacti, and monit not as popular as Nagios, I think they're still worth trying out.

  5. Re:Restarting services... by NevarMore · · Score: 3, Interesting

    Egads! My education is useful!

    We're discussing such issues in a class I'm taking on software fault tolerance. In discussing selective restarts and backup processes, Apache is frequently cited as an example of how software should fail gracefully and consistently, and then handle that failure itself. The lecture slides can be found here: http://wwwse.inf.tu-dresden.de/index.php?language=English&site=courses&course=ss06vl02

    Apache has some memory leaks in it. It is not bad, it happens, especially in a piece of software like that which is expected to run constantly and NEVER fail. So what the Apache software does is every so often, or when it detects that its memory usage is getting out of hand, it fires up a second copy of itself and then kills itself letting the new not-yet-leaky copy take over.
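    A toy simulation of that recycling idea, not Apache's actual code: a pool retires its worker after a fixed number of requests so that any leak stays bounded. The class names and the per-request "leak" counter are hypothetical; in Apache the comparable knob is the MaxRequestsPerChild directive.

```python
class Worker:
    """Stand-in for a server process that leaks a little on every request."""

    def __init__(self, generation: int) -> None:
        self.generation = generation
        self.handled = 0
        self.leaked = 0  # pretend units of leaked memory

    def handle(self, request: str) -> str:
        self.handled += 1
        self.leaked += 1
        return f"gen {self.generation} handled {request}"


class RecyclingPool:
    """Replaces the worker once it has served `max_requests` requests."""

    def __init__(self, max_requests: int) -> None:
        if max_requests < 1:
            raise ValueError("max_requests must be at least 1")
        self.max_requests = max_requests
        self.replacements = 0
        self.worker = Worker(generation=0)

    def handle(self, request: str) -> str:
        if self.worker.handled >= self.max_requests:
            # Retire the leaky worker and start a fresh one in its place.
            self.replacements += 1
            self.worker = Worker(generation=self.replacements)
        return self.worker.handle(request)


if __name__ == "__main__":
    pool = RecyclingPool(max_requests=3)
    for i in range(7):
        print(pool.handle(f"request {i}"))
    print("replacements:", pool.replacements)
```

    The leak is never fixed here, only capped: no worker ever accumulates more than `max_requests` units before it is replaced.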

    So to you (IT/admin) that daemon may run forever, but that's because my people (CS/developer) did our jobs (for once) and ensured that the application cleaned up its own messes.

  6. Add OpenNMS by nrc · · Score: 3, Informative

    Add OpenNMS to the list of stuff that this duplicates or overlaps with. Not that anyone in OSS needs permission to reinvent the wheel. You've got an itch - you scratch as it pleases you.

  7. Re:Insignificant in the trails of NAGIOS? by Stinking+Pig · · Score: 3, Interesting

    Because in software-land, "mature" is rapidly followed by "obsolete." I love Nagios, but I'm hesitant to recommend it to anyone who's not comfortable spending a week on building and configuring software.

    Packages for it are often broken or from the old 1.3 tree, which makes for confusion when following examples that use 2.0 syntax.

    Starting a configuration from scratch is extremely challenging, especially if you want to do anything custom.

    There are a number of external dependencies, particularly if you want to compile the plugins.

    That said, Nagios still whips the pants off quite a few commercial monitoring products I've evaluated.

    --
    "Nothing was broken, and it's been fixed." -- Jon Carroll