Slashdot Mirror


Nagios 3 Enterprise Network Monitoring

jgoguen writes "Nagios, originally known as Netsaint, has been a long-time favourite for network and device monitoring due to its flexibility, ease of use, and efficiency. Nagios provided, and still provides today, a low-cost, versatile alternative to commercial network monitoring applications. Nagios 3 takes a huge step forward compared to Nagios 2, providing improved flexibility, ease of use and extensibility, while also making significant performance enhancements. Due to its extensibility and ease of use, no device or situation has yet been found that cannot be monitored using Nagios and a pre-made or custom script, plug-in or enhancement." Read on for the rest of jgoguen's review.

Nagios 3: Enterprise Network Monitoring
author: Max Schubert, Derrick Bennett, Jonathan Gines, Andrew Hay, John Strand
pages: 339
publisher: Syngress
rating: 8
reviewer: jgoguen
ISBN: 978-1-59749-267-6
summary: Making Nagios 3 work for you and your business.

The first chapter is devoted to new features in Nagios 3. It covers the major changes implemented for Nagios 3, which include changes to data storage options and locations, checks, configuration objects, and macros, along with operational, performance, and usability enhancements. Users upgrading from Nagios 2, or users who may already be familiar with Nagios 2, will gain the most from this chapter. New users will still gain value from this chapter, however, since a number of changes also involve some of the major features of Nagios. In addition, users who may be referring to configuration file samples created for Nagios 2 will save a great deal of time by referring to this chapter for changes. Using Nagios 2 configuration files directly prevents users from enjoying some new features of Nagios 3. Users who will only be writing plug-ins and scripts for their local Nagios deployment might not find Chapter 1 very useful.

Chapters 2 and 3 deal with scaling Nagios to work efficiently within large deployments. First, the authors show how to design a Nagios configuration for large organizations. This is something that all Nagios administrators should make use of when designing configurations, not only administrators in large organizations, because a properly done configuration for a small organization will easily scale up as the organization grows. I was impressed to see that the authors stress the importance of the end user's input when designing configurations. Administrators who ignore this piece of advice risk the success of Nagios in their organization. Diagrams help to explain the relationships between the various Nagios configuration objects. A good amount of detail is provided on allowing various groups within an organization to have semi-independent control over how Nagios interacts with their hosts and services, and how Nagios alerts their staff. The authors have included numerous configuration file snippets, which allow a Nagios administrator to very quickly create a configuration file and then tweak the configuration parameters to suit local requirements.
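
As a rough illustration of the kind of snippet the chapter describes, the sketch below renders two Nagios object definitions from Python dictionaries: a template, and a host that inherits from it with `use`. The host name, address, and contact group are made-up placeholders, and the helper `render_object` is hypothetical, not something taken from the book. The output is illustrative only and is not a complete or validated configuration.

```python
def render_object(kind, directives):
    """Render one Nagios object definition (e.g. kind='host') as text."""
    width = max(len(key) for key in directives) + 4
    lines = [f"define {kind} {{"]
    for key, value in directives.items():
        lines.append(f"    {key.ljust(width)}{value}")
    lines.append("}")
    return "\n".join(lines)


# Template: register 0 keeps Nagios from treating it as a real host.
generic_host = {
    "name": "generic-host",          # template name (placeholder)
    "max_check_attempts": 3,
    "notification_interval": 60,
    "register": 0,
}

# Real host: inherits from the template and adds team-specific settings.
web_host = {
    "use": "generic-host",
    "host_name": "web01",            # placeholder host
    "address": "192.0.2.10",         # documentation-range address
    "contact_groups": "web-admins",  # placeholder contact group
}

config_text = "\n\n".join(
    render_object("host", obj) for obj in (generic_host, web_host)
)
print(config_text)
```

Keeping shared settings in a template and only the differences in each host is one way a small configuration can grow without being rewritten, which is the point the reviewer makes about scaling.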

Scaling the Nagios graphical user interface (GUI) follows a very simple concept: use a "less is more" approach. Although the specific details here deal with Nagios, the general idea is equally applicable to anyone displaying information they expect their users to actually pay attention to. In general, users should be able to see as much as they want (limited by resources and permissions) but only be shown what they need to know about by default. For example, the system administrator for marketing probably does not need to know when the development disk image server goes down, while the development system administrator would probably be very interested. User accounts let the administrator give various groups access to Nagios, filtered by its fine-grained permissions system. Users from various groups can also be shown only what they need to be shown by default, without the need to select a particular area first. User accounts also keep users who only need to view Nagios from having full administrative control, and allow records of each user's actions to be kept. Using a patch provided with the book's download package will enable Nagios to have read-only accounts as well, which is great for organizations that would like to grant certain users (or groups) access to view Nagios but not make any changes. As an example, an organization's help desk could use Nagios to determine quickly whether users are unable to access services because of an outage, or if further troubleshooting is necessary.

The authors continue on here to discuss clustering, failover, and the future of the Nagios GUI. I'm not convinced that these belong in a chapter devoted to scaling the Nagios GUI, since they mostly deal with scaling the entire Nagios deployment. Regardless, they are all very important topics, especially when Nagios is heavily relied upon. Clustering allows remote sites to have a Nagios instance local to the site monitoring hosts and devices rather than requiring a central Nagios instance to monitor remote hosts and services. Central monitoring of remote hosts and services would not only take much longer because of the WAN links between the central instance and remote locations, it would also carry security implications from allowing the checks across those links. The authors don't discuss the security side of clustering, but it's still something that every Nagios administrator (and everyone else!) should keep in mind. The clustering section deals primarily with the rationale behind clustering and how to configure the local and remote instances of Nagios properly, but the authors include a good deal of information here that a less experienced Nagios administrator might overlook. Most notable is their discussion about the display of service status when a service is reachable from the master server but not from a remote instance. While Nagios can translate the remote instance's check result to be displayed from its own perspective, it may be more desirable to have the master Nagios GUI display the results from the perspective of the server which made the check. After implementing clustering, some sort of fallback mechanism is required. Failover and redundancy are the two main choices, and that's what the authors discuss next. They don't spend much time on redundancy, since this would require each redundant Nagios instance to perform its own set of checks, which can significantly raise the load on both the monitored hosts and the network in general. Given the problems it can introduce, the authors have still spent more time on redundancy than most administrators should spend considering it. Failover is a much better solution, and the authors do a great job of covering how to set it up properly. As usual, they make sure to remind readers of some things that are easily overlooked, especially when you're trying to get Nagios back up and running after the master server crashes.
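
A minimal sketch of the failover decision, assuming a common arrangement in which the standby instance stays quiet while the master looks healthy and starts notifying when it does not. The freshness threshold and function names are hypothetical; `ENABLE_NOTIFICATIONS` and `DISABLE_NOTIFICATIONS` are Nagios external commands, submitted in the `[timestamp] COMMAND` form. This is not necessarily the procedure the book describes.

```python
import time


def master_is_alive(last_status_update, now, max_age_seconds=300):
    """Treat the master as alive if its status data was refreshed recently."""
    return (now - last_status_update) <= max_age_seconds


def standby_command(master_alive, now):
    """Build the external command line the standby should submit to itself."""
    name = "DISABLE_NOTIFICATIONS" if master_alive else "ENABLE_NOTIFICATIONS"
    return f"[{int(now)}] {name}\n"


now = time.time()
stale = now - 900  # master last updated 15 minutes ago (made-up value)
command_line = standby_command(master_is_alive(stale, now), now)
print(command_line, end="")
```

In a real deployment the resulting line would be appended to the standby's external command file, whose location is set by `command_file` in the main configuration file.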

Chapters 4 and 5 discuss Nagios plug-ins, add-ons, and enhancements. These chapters alone are worth the price of the book because of how much time they can save. It's much faster to copy a script and make minor tweaks than it is to try reinventing the wheel, and with the number of scenarios covered here combined with the Nagios user community there aren't very many things that haven't been done already. Whether you want to test command-line interfaces, CPU usage, memory utilization, bandwidth utilization, HTML pages, LDAP services, or even specialized hardware, there's probably already a plug-in written for it. Most common scenarios actually have a plug-in already included in this book. The available add-ons and plug-ins are equally varied, providing ways to monitor hosts across security zones, configure read-only displays that live in a security zone other than the one Nagios is in, interface with Cacti, and even read out alerts. Even more scenarios can be handled by other scripts provided by the Nagios community.
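
For readers who have not written a plug-in before, here is a minimal sketch of one in Python. It relies only on the standard plug-in conventions: one line of output, optional performance data after a `|`, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The disk-usage check, thresholds, and function names are illustrative assumptions, not code from the book.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_pct, warn, crit):
    """Map a usage percentage to a (exit_code, output_line) pair."""
    if used_pct >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif used_pct >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Performance data: label=value[unit];warn;crit;min;max
    perfdata = f"used={used_pct:.1f}%;{warn:g};{crit:g};0;100"
    return code, f"DISK {label} - {used_pct:.1f}% used | {perfdata}"


def main(path="/", warn=80.0, crit=90.0):
    """Check one filesystem, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"DISK UNKNOWN - {exc}")
        return UNKNOWN
    code, line = evaluate(100.0 * usage.used / usage.total, warn, crit)
    print(line)
    return code


# A real plug-in would finish with: sys.exit(main())
exit_code = main()
```

Separating `evaluate` from the code that gathers the measurement makes the threshold logic easy to test without touching a real disk.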

Chapter 6 covers how to integrate Nagios into an enterprise environment, with just enough detail to get Nagios configured to work with a large number of third-party services, such as LDAP authentication, Cacti, Puppet, and Splunk. Emphasis here is always placed on the human element: how to use Nagios to help help desk and/or NOC staff do their jobs more efficiently and effectively, and how to gain maximum support for Nagios within the organization. The importance of the human element, in all its forms, simply cannot be overstated, and the authors have done a wonderful job of outlining a good way to make Nagios an integral part of an organization. A lot of the material towards the end of the chapter, especially the section on smaller Network Operation Centres, could be used by anyone looking for ways to help a small group work together effectively.

Chapter 7 is another chapter with a lot of content easily applicable outside of a Nagios environment. The chapter begins by reminding you to know your network and to watch out for session hijack attacks, then shows you how to use Nagios to do both. Nagios can't replace a competent network administrator, but it can make their lives easier, and the authors show how the configuration you've already done can reveal a potential session hijack attack and how it forces you to properly know your network. Nagios forces you to know your network not only in terms of how it's built and what devices are in use, but also by requiring a solid handle on what constitutes normal conditions for all your devices and services.

Another area that is very important to companies, especially companies operating in the United States, and that Nagios can assist with, is regulatory compliance. The authors outline how a company could use Nagios to assist with compliance with Sarbanes-Oxley (SOX) with COBIT or COSO, Payment Card Industry (PCI) Data Security Standard (DSS), Director of Central Intelligence Directive (DCID) 6/3 and Department of Defense (DoD) Information Assurance Certification and Accreditation Process (DIACAP). Nagios alone isn't enough to be compliant; at the very least, detailed documentation will also be required. Still, the authors give a good overview of how Nagios can assist with compliance in all of these regulations.

The final chapter helps to bring the rest of the book together by walking through a full Nagios configuration for a fictional Fortune 500 corporation. The bulk of this chapter covers the pre-deployment stage of a Nagios deployment, but that doesn't mean that there isn't a lot to learn about deploying Nagios. A major hurdle towards deploying Nagios in an organization is the pre-deployment phase, and the authors outline here how to easily turn this major challenge into a series of simple steps to increase the chances of Nagios' success in your organization. From the very beginning, you can see how involving the customer early and starting small, along with everything else, becomes a part of a process. Although it's specific to Nagios, the process followed here could be easily adapted to integrating any sort of monitoring service. The remainder of the chapter is devoted to how you might integrate Nagios into a Fortune 500 company, finishing the book off with some good advice for integrating Nagios.

Despite all the book's strengths, there is some room for improvement. In chapter 2, it may have been more effective to outline the relationships between the Nagios configuration objects before discussing configuration planning. I found it much easier to think of a configuration for a large organization after knowing about how Nagios' configuration objects relate to each other.

Throughout the book, the authors have included configuration file snippets, scripts, and example script output in the main text. While all of these are quite useful and serve to enhance the book, I think it would have been better if these were all included in an appendix instead, perhaps keeping only the relevant parts of configuration snippets in the main text for clarification.

At the end of chapter 3, the sections on the future of Nagios and the CGI front end are informational and interesting, but they would be better placed in a separate chapter dealing with the potential future of Nagios in general. These and the other major areas of Nagios combined would provide more than enough material for a full chapter on their collective futures.

Overall, this is a great book for anyone using Nagios as more than a casual user, and is still very informative for the casual user. A few of these chapters alone would be worth the price of the whole book.

Disclaimer: I worked with one author when I was asked to review this book.

You can purchase Nagios 3: Enterprise Network Monitoring from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

147 comments

  1. Spam alert! by Animats · · Score: 0, Troll

    A review, by an associate of the author, of an obscure product, with a picture of the book plastered on the front page of Slashdot. Who was paid off for that?

    1. Re:Spam alert! by Hijacked+Public · · Score: 1

      Slashdot, as per the usual.

      --
      "Sacrifice for the good of The State" - The State
    2. Re:Spam alert! by nobodylocalhost · · Score: 1

      You, sir, have no idea how right you are about Nagios... It spams, a lot. And depending on how well you know what you are doing, it will spam you anywhere from a couple of mails per hour to literally e-mail bombing you so you can't even open your e-mail client.

      --
      Where is the "Ignorant" mod tag?
    3. Re:Spam alert! by Ngarrang · · Score: 5, Informative

      A review, by an associate of the author, of an obscure product, with a picture of the book plastered on the front page of Slashdot. Who was paid off for that?

      Obscure product? What world have you been living in?

      --
      Bearded Dragon
    4. Re:Spam alert! by Anonymous Coward · · Score: 0

      I suffer from logizomechanophobia, you insensitive clod! I can barely bring myself to type on a computer, much less network them and run a full TCP/IP stack!

    5. Re:Spam alert! by _Sprocket_ · · Score: 1

      Point taken. However, I'm not sure exactly how obscure Nagios is. In the IT circles I run around, it's pretty well known. But then again, I'm in a fairly mixed environment.

    6. Re:Spam alert! by MrNaz · · Score: 5, Funny

      Nagios is only obscure if you are not a network admin, Linux geek or data center operator.

      So the real question is; what are you doing here?

      --
      I hate printers.
    7. Re:Spam alert! by IceCreamGuy · · Score: 2, Insightful

      ...of an obscure product...

      The only things I can think of that would make someone say something like this are:

      You're not a systems administrator

      You're new to systems administration

      You're a bad systems administrator

      You don't keep up with grade-A open source enterprise solutions

      You work for a company that has a budget big enough that you don't ever consider open source

    8. Re:Spam alert! by SgtAaron · · Score: 4, Insightful

      You, sir, have no idea how right you are about Nagios... It spams, a lot. And depending on how well you know what you are doing, it will spam you anywhere from a couple of mails per hour to literally e-mail bombing you so you can't even open your e-mail client.

      I'm thinking that you may be one of those that need the book. :-) The amount and frequency of alert emails is easily configurable. And I think you need a new mail client! How about trying mutt? :)

      The "notification_interval" can be set to 0 so that nagios will only send one alert, period. Now, if you have a bunch of services/hosts down you will get a lot of messages unless you've taken steps to mitigate that. But isn't that better than *not* knowing your network has run home to momma?

      We've been using Nagios now for months and it may be the least buggy code running on any of our machines. Rock-solid, I tell you.

      regards,
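
To make the arithmetic concrete, here is a small sketch of how many notifications a single unacknowledged problem produces for a given `notification_interval`. It assumes the default time unit of one minute and ignores escalations, flapping, downtime and notification periods, so treat it as a back-of-the-envelope model rather than Nagios's actual scheduling logic. The function name is made up.

```python
def notification_count(outage_minutes, notification_interval):
    """Estimate notifications sent for one problem lasting outage_minutes."""
    if outage_minutes < 0 or notification_interval < 0:
        raise ValueError("durations must be non-negative")
    if notification_interval == 0:
        return 1  # 0 means: notify once, never re-notify
    # One initial notification, then one per full interval that elapses.
    return 1 + outage_minutes // notification_interval


for interval in (0, 60, 5):
    print(interval, notification_count(480, interval))
```

Under these assumptions an eight-hour outage produces 1 message with an interval of 0, 9 with an interval of 60, and 97 with an interval of 5, for each affected host or service.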

    9. Re:Spam alert! by Anonymous Coward · · Score: 1, Insightful

      Nagios may be many things but ease of use...??? I suspect the author is suffering from crack-induced delusion.

    10. Re:Spam alert! by mosinu · · Score: 1

      Actually, if you set up dependencies properly, even a major network outage won't mail bomb you too badly.

    11. Re:Spam alert! by perldork · · Score: 1

      Agreed that the review does seem less than impartial but I can assure you that Syngress doesn't pay anyone enough to have money to 'pay people off.' :p

    12. Re:Spam alert! by perldork · · Score: 2, Interesting

      Setting up dependencies is a must, use notification_delay wisely, and only send out emails or other 'push' notifications for problems that have to have immediate attention. I like to take the approach of monitoring everything that is important but only send emails out for problems that really truly require immediate attention from on-call staff.
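
A toy model of why dependencies cut the mail volume: if each host records the device it sits behind, only the failures closest to the monitoring server need to page anyone. The topology and names below are invented, and the single-parent rule is a simplification of how Nagios actually works out reachability.

```python
def hosts_to_notify(parents, down):
    """Return down hosts whose parent is still up (or that have no parent)."""
    return sorted(host for host in down if parents.get(host) not in down)


# child -> parent, as seen from the monitoring server (invented topology)
parents = {
    "router": None,
    "switch": "router",
    "web01": "switch",
    "web02": "switch",
    "db01": "switch",
}

down = {"switch", "web01", "web02", "db01"}
print(hosts_to_notify(parents, down))
```

With the switch down, the three servers behind it are unreachable rather than independently broken, so only the switch is reported.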

    13. Re:Spam alert! by empaler · · Score: 1

      Then think of how fun it must be to have it send SMSes instead of email to you when you're at work - for a countrywide network! Yay!

    14. Re:Spam alert! by empaler · · Score: 2, Interesting

      Also, configure planned downtime FTW. I do enjoy that it can be set to auto-ignore flapping hosts.

    15. Re:Spam alert! by empaler · · Score: 1

      It's even in a recent issue of BOFH.

    16. Re:Spam alert! by isorox · · Score: 1

      Setting up dependencies is a must, use notification_delay wisely, and only send out emails or other 'push' notifications for problems that have to have immediate attention. I like to take the approach of monitoring everything that is important but only send emails out for problems that really truly require immediate attention from on-call staff.

      We don't send emails. My corp has several Nagios installations around the place: our own is a 200-host/800-service one, and another department has a larger installation of 500 hosts.

      We don't send any emails; people ignore emails. Instead, we present a list of service problems in a hierarchical way (a custom webpage that reads the current status). This allows the engineers to see what's affected by a problem. When no services are affected, they can tackle the normal service problems (e.g. 3 of 50 nodes in a process cluster aren't running).

      Each service page has links into our wiki and logging system to search for problems, and find out how to enter them.

      We don't send emails though. We trap "one-off" alerts by setting an alert to critical for 15 minutes, then automatically greening it.

    17. Re:Spam alert! by perldork · · Score: 1

      Very nice approach, sounds like a well thought out system. I am doing the same thing with regards to Wiki jump-off .. using the service and host 'notes url' links to provide jump off points to our Wiki so that anyone using Nagios can add SOPs, notes, etc, whatever fits the group using Nagios best.

      I might borrow your idea :) and do something similar using the JSON output from Nagios, that sounds very efficient.

  2. Nagios is great by kimvette · · Score: 4, Insightful

    Nagios is great but even version 3 is by no means easy to configure. Like all too many F/OSS projects, the documentation is lacking or even incorrect in spots, and supplied examples barely scratch the surface of what the application can do.

    I've been running it and it's great - I have it monitoring a bunch of servers (email, hosting, backup, file, etc.) with custom scripts and it works great -- once it's configured.

    --
    The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    1. Re:Nagios is great by kimvette · · Score: 2, Insightful

      Oops, submitted too early.

      Nagios is especially helpful in a smaller environment where you have limited personnel; as long as nagios is up and running you can have it email, page, or text you so that you know there's an issue without having to have personnel monitoring it all manually - and it provides a decent log via the web interface.

      My main point is this: if this book is as good as the reviewer indicates, it should be very well worth buying if you need a F/OSS server monitoring solution.

      --
      The Christian Right is Neither (Christian nor right). See: Matthew 23, Matthew 25, Ezekiel 16:48-50
    2. Re:Nagios is great by Anarke_Incarnate · · Score: 1

      Get Hyperic or one of the other "Small 4" monitoring apps. Never look back.

    3. Re:Nagios is great by Fez · · Score: 1

      The initial configuration of Nagios can be quite a pain, but as I said in another post farther down the page here, with judicious use of templates, it is now very easy to manage once configured.

    4. Re:Nagios is great by SgtAaron · · Score: 1

      Nagios is great but even version 3 is by no means easy to configure. Like all too many F/OSS projects, the documentation is lacking or even incorrect in spots, and supplied examples barely scratch the surface of what the application can do.

      Hmmm, I have to say I was pleasantly surprised by the documentation. We had Argus running here for monitoring for awhile and I finally got tired of its very obscure docs and its bugs. Nagios has been an entirely different experience.

      And the Nagios mailing list is very well-read, it seems to me.

      I've been running it and it's great - I have it monitoring a bunch of servers (email, hosting, backup, file, etc.) with custom scripts and it works great -- once it's configured.

      Yes, I will admit it took a bit of time to get the hang of it. But I also remember it took a bit of time when I first tackled BIND and apache way back when! And I agree, we all sleep better at night with Nagios around (except when I hear a whoop-whoop :)

    5. Re:Nagios is great by gbjbaanb · · Score: 2, Funny

      And I agree, we all sleep better at night with Nagios around (except when I hear a whoop-whoop :)

      you have it sending alerts to the Police?! I'm not sure that's what the IT guys had in mind when they said '24 hr emergency response'. :)

    6. Re:Nagios is great by afidel · · Score: 1

      Hyperic looks cool but the support costs are insane, ~$800/2CPU's/year? I only paid around $1,600 for WhatsUp and pay a fraction of that per year in maintenance to monitor up to 1,000 devices. It's not perfect but I can deal with some warts for that kind of savings!

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    7. Re:Nagios is great by Anarke_Incarnate · · Score: 1

      There's a free version of the software. You lose some things like forming groups, etc but it is much better than "whatsup" as it can go on a per service/script scan. The free version also loses out on "remediation", where you define a condition like "If apache is not up, run this cleanup and then start apache, and notify this group." Also, Hyperic can give you discounts and even charges half for non-production environments.

    8. Re:Nagios is great by afidel · · Score: 1

      You can set the polling interval on a per service or monitor basis in WhatsUp as well. It also includes remediation in the base price (though it doesn't include any functionality other than restart service out of the box, but that's usually custom anyways). Their discounts would have to be like 90% to be cost competitive, do you have any idea what their site licensing costs are like? I don't feel like being pestered by a sales puke for information that should be right on their website.

      --
      There are 4 boxes to use in the defense of liberty: soap, ballot, jury, ammo. Use in that order. Starting now.
    9. Re:Nagios is great by Anarke_Incarnate · · Score: 1

      I have not used whatsup for years and even then, just a demo. I liked hyperic for the free version and in some environments, especially the OpenView and Tivoli ones, the $800 per 2 CPU would be about a 70% discount.

    10. Re:Nagios is great by Maniacal · · Score: 1

      We use GroundWork. They provide a graphical front end for the nagios configuration that takes a lot of the pain out of it. I think they only support Nagios 2 currently but we've been happy with it and it's free. They have VMware appliances as well which gives you a zero install deploy option making it even easier.

      --
      MG
    11. Re:Nagios is great by mosinu · · Score: 1

      Like all too many F/OSS projects, the documentation is lacking or even incorrect in spots, and supplied examples barely scratch the surface of what the application can do.

      Last time I checked you were free to update and contribute to that lack of documentation so many bitch about that is missing or incorrect....

    12. Re:Nagios is great by perldork · · Score: 2, Insightful

      Agreed, time and money spent on integrating Nagios into an organization (or any other free OSS product) to me is much better time and money spent than spending money on licenses and paying support people for a commercial product who then not only get your money but also get the benefits of the knowledge learned from the experience instead of your company or group getting that information.

      Even that wouldn't be sooo bad except that many commercial companies don't even share that knowledge in a way that other customers can benefit from unless they pay for consulting time ... most commercial NNM producers have horrid public forums and KBs that really only cover issues related to upgrades and licensing as opposed to lessons learned by other customers.

      This of course only applies to organizations that have development/IT groups that are large enough to support custom integration efforts, I understand that there are many places who can't afford to invest in in-house development or who really do not want to learn how to do systems/application/network monitoring themselves.

    13. Re:Nagios is great by perldork · · Score: 1

      Yes, there are the ugly monsters. Examples:
      * Tivoli, Openview, BMC anything

      Then there are the less expensive monsters that came out to beat the ugly monsters by being relatively cheaper and slightly more open and useful. Examples:
      * Spectrum, eHealth
      * SevOne

      Then there are the 'cheap but not so NOC friendly' products that came out to make basic monitoring easy. Examples:
      * What's Up Gold
      * Server's Alive

      Then there are the OSS projects that came out in reaction to the expensive less than open commercial projects. Examples:
      * Nagios
      * Pandora
      * Cacti
      * Big Brother
      * MonIT

      Then there are the products that try to hit the middle ground between free/OSS and commercial but more reasonable than the monsters. Examples:
      * Zabbix
      * Hobbit
      * ZenOss

      and more expensive but less than the monsters ... example:
      * Hyperic

      My list is not comprehensive by any means .. so companies and consultants have a lot of different migration paths they can use to get away from the very expensive, very stovepipe NMSs that used to rule the field of monitoring to less expensive ones with some custom work allowed to free ones that require a fair amount of custom work with the tradeoff of no licensing or overpriced support options (obviously once you go down doing in-house work you incur the cost of having in-house developers maintain the work unless your organization is willing to let you contribute back to the OSS community .. and even then you will still have in house support needed).

    14. Re:Nagios is great by Anarke_Incarnate · · Score: 1

      Having been in Openview shops exclusively, I can see how powerful, and yet frickin' misguided it is. Openview is used, not because of how powerful or elegant it is (it is one, but not the other) but because "HP makes it, and everybody uses it, so we are taking less risk."

      I have never been in an environment that actually used Openview and LIKED it, and the alerts they got. Often times, the operator at the console sees a scroll of gibberish and just recognizes that the reds and oranges are meant to be that way, and ignores them for the most part, EXCEPT when they are meaningless.

      I have been called at 3AM because the status of a node changed from NORMAL to NORMAL.......

      However, when we lost 90% of our data center to over heating from a total A/C failure......well, they just assumed an outage that big was a scheduled one they were not made aware of and ignored it, due to it filling their screen up......

      All the NMS software in the world can't fix lazy, stupid or overwhelmed.

    15. Re:Nagios is great by perldork · · Score: 1

      Having been in Openview shops exclusively, I can see how powerful, and yet frickin' misguided it is. Openview is used, not because of how powerful or elegant it is (it is one, but not the other) but because "HP makes it, and everybody uses it, so we are taking less risk."

      Great point, there is a very interesting anti-pattern that I have seen happen in several larger organizations with commercial products like HPOV or Tivoli:
      * Someone goes for the self-healing, auto-discovers everything hype that commercial monster product marketing types throw at potential customers
      * Company buys software
      * Company pays consultants to help install and configure it
      * Company pays big bucks for a support contract
      * Company pays for a handful of employees to get training at some level

      So already our company has a big investment in the software and they haven't even done anything real with it yet.

      * Company gets to be very comfortable with the software and used to its quirks
      * Company discovers "Ah, it does X and Y, but we realllly need Z to be covered"
      * Company shells out more and more $$$ to learn to customize OR starts paying in-house developers to work around the limitations

      So now a rational and reasonable manager or employee says "Hey, isn't this costing us waaay more money than P or Q or R or even S, or even OSS T if we did it ourselves?"

      By this time the company is so deep financially into the product that they are afraid to tell higher ups they need to change for fear of looking like they just wasted a ton of money on something that really doesn't meet their needs.

      Laziness kicks in as well as does a strong desire not to have to go through that same integration process with another large product and another large company .. so the cycle continues :).

      This doesn't happen all the time, but is something I have seen a number of times.

      So, back to Openview NNM for a second:

      I was impressed that with Openview NNM 8 they moved to using PostgreSQL and JBoss, but then lost that 'warm glow' when I found out that the basic NNM still just does basic SNMP tests for nodes and no performance tests .. and then got a cold feeling when we found out that the performance add on (SPi) for it that lets it do basic network metrics (in/out bits, errors, collisions) has to run on a Windows host! Doh! And then the reporting uses Crystal / Business Object bloatware .. gross!

    16. Re:Nagios is great by Anarke_Incarnate · · Score: 1

      They got rid of VPR? I was using a very old OV on Solaris 8 or 9 (280R). I never got the love for SPARC but I am a Linux and HPUX guy, myself. I wish I knew Solaris better, but it always felt so different just for the sake of being different.

      The other thing I found with OV was that it never gets set up properly. Often alerts are suppressed that are not meant to be, things get missed, or it gives you so much to wade through you may as well miss alerts. The issue is not just the software, but the methodology. The people who actually NEED the alerts have to get them from people who are often paid screen watching monkeys who have little to no idea what any of it means, and are paid poorly and treated poorly and then don't do a good job.

      My company is now paying HP to come back and tell us how to improve our process, meanwhile if they actually listened to the employees and contractors, they would know exactly what we need to do. Oh well, guess we need to go fill out a code for that (Man, do I hate having to do stupid arbitrary paperwork for the most mundane crap).

    17. Re:Nagios is great by secolactico · · Score: 1

      You lose some things like forming groups

      That's actually a pretty big loss, if you ask me.

      I'm no fan of Whatsup, but I use it and it's quite flexible when it comes to monitoring and "remediation" if you dabble in JScript or VBScript.

      I tried Zenoss and found it superior to Hyperic (imho), but sorely lacking in documentation and with not very active community forums. This was last year, I guess it's time to give it (and OpenNMS) another try.

      Is there an opensource product that centralizes network and systems monitoring? Every product I've tried leans toward one end or the other. And things like netflow/sflow are a bear to implement with the current tools.

      My monitoring needs are currently met by Nagios, Cacti (several of each), Whatsup, Orion and Zabbix (hooray for lightweight monitoring agents!). Of course this leaves a mess of "dashboards" to check when trying to get a bird's-eye view of the network's health.

      --
      No sig
    18. Re:Nagios is great by Jellybob · · Score: 1

      It's even more useful in a large environment - our setup currently monitors 1900 pieces of hardware, each with a few services on them. Without some sort of monitoring system, it would be near impossible to notice if one of those goes down, especially if it's customer hardware which people in the company aren't using.

    19. Re:Nagios is great by perldork · · Score: 1

      Nice points about set up; this is a major miss in a lot of NNM setups, regardless of product choice. Somehow companies do not get or forget that an NNM really is a communication proxy and filter:
      * Communicate events to people *who can do something about the issue* without having to have a person perform a mindless, repetitive check over and over.
      * Make it easier to store and view historical performance data and alert data for the purposes of knowing when to scale up or replace hardware or focus on software that is causing problems.
      * Help close the communication gap between those who manage hardware, those who create software, operations and maintenance staff, and management.
      * Make it easier to communicate system, application and network performance and availability to managers, non-technical people in an organization, and customers.

      Some companies also seem to forget that before NNMs people had to do manual checks, and when that was the case, because that was an expensive operation, you would only have people checking the most important indicators of health for the system, application, or device or you would be wasting precious paid people time :P.

      So many companies mistakenly think that doing what you describe is enough to make for a meaningful system:
      * Monitor everything that can be monitored
      * Throw up alerts on a big screen
      * Send emails

      You hit the nail on the head about methodology.

      Most software packages can be configured to do the right thing for an organization, but the organization has to invest time in people still to get that done :p. Having 3 junior people doing 'eyes on glass' as thousands of events occur for systems or domains of expertise they have not been trained in is a waste of money and only creates frustration and a negative attitude about the value of automated / semi-automated monitoring.

      Like you, too often I have seen a company think that network management software is the solution rather than it being treated as the glue that helps the people in a company communicate effectively about applications, systems, and networks.

  3. not good. by Lord+Ender · · Score: 2, Insightful

    Is it extensible? Is it easy to use? I didn't get it the first time, better repeat it a few more times...

    My personal experience is that Nagios is probably the LEAST easy to use of any piece of software, period. I hope they changed it in a major way, because last time I tried to use it I was forced to dig through configuration files and learn syntax just to get the thing to see if some server was responding to pings.

    --
    A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    1. Re:not good. by Anonymous Coward · · Score: 2, Informative

      Just get a good front end for nagios, like GroundWork Open Source. That will make configuration loads easier. (posted as ac 'cause my password is so good I can't remember it)

    2. Re:not good. by walt-sjc · · Score: 3, Insightful

      Oh please. It's NOT THAT HARD!!!! For what it does, it's fairly simple actually. Compared to any other package of similar capability, it's quite average in terms of difficulty actually. No worse than something like Exim or Apache. Just think of each server as a vhost and each service as a location directive.

    3. Re:not good. by amorsen · · Score: 1

      Nagios is probably the easiest to use network monitoring system. That doesn't mean it's particularly easy, just that the others are worse. It breaks down when the network has trouble though; if a significant number of hosts are unreachable it takes forever for nagios to figure it out. That tends to be exactly when you need it the most.

      --
      Finally! A year of moderation! Ready for 2019?
    4. Re:not good. by Qzukk · · Score: 1

      forced to dig through configuration files and learn syntax

      As opposed to what, punching the numbers into a pretty little GUI like one of many?

      --
      If I have been able to see further than others, it is because I bought a pair of binoculars.
    5. Re:not good. by mindstrm · · Score: 3, Informative

      If all you want is a tool to ping a few servers, nagios is overkill.

      My gut reaction is that if nagios configs seem too complicated, you likely have never had to roll out real enterprise monitoring.

      Our Nagios install monitors thousands of things, many of them custom tests.
      (Transaction volumes, application response times, cron job status, files....).. it can be made to be the focal point for all the "stuff" the people responsible for monitoring company IT operations need to know about.

    6. Re:not good. by Anonymous Coward · · Score: 0

      Take a look at Zabbix, then tell me nagios is simple.

      Unless Nagios provides a feature you need that isn't in Zabbix (escalations? Log monitoring?) I'd strongly recommend it

    7. Re:not good. by pak9rabid · · Score: 1

      Check out GroundWork. It's basically Nagios + a fairly easy-to-use web interface. We've been using it up at my work for over a year and it works great.

    8. Re:not good. by isorox · · Score: 3, Insightful

      Is it extensible?

      Yes, what can't you monitor with nagios?

      Is it easy to use?

      You should see our 2nd line people, if they can use it, anyone can.
      1) Big red problem appears on page
      2) They click the link to the logging system which does an asset-based search showing recent problems.
      3) They click the link to the wiki page for that host, which hints at how to fix it.
      4) Red thing goes away

      There's a difference between *use* and *configure*. Nagios is the easiest monitoring system we've ever used in our department. It's pretty easy to configure too when you know what you're doing (one config file per device host, one directory per logical division of devices, one perl script to splat out the devices, one subversion repository to version track everything)
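      The layout described above (one config file per host, one directory per logical division, a script to generate them) can be sketched roughly as follows. This is a minimal Python sketch rather than the Perl script mentioned in the comment; the inventory, the division names, and the `generic-host` template name are all made-up assumptions for illustration.

```python
from pathlib import Path
import tempfile

# Hypothetical inventory: (division, host_name, address). All values are made up.
INVENTORY = [
    ("servers", "web01", "192.0.2.10"),
    ("servers", "db01", "192.0.2.11"),
    ("routers", "edge01", "192.0.2.1"),
]


def render_host(host_name, address, template="generic-host"):
    """Return one Nagios host object definition as text.

    The template name is an assumption; use whatever host template
    the local configuration actually defines.
    """
    return (
        "define host{\n"
        f"    use        {template}\n"
        f"    host_name  {host_name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def write_configs(root, inventory):
    """Write one .cfg file per host, grouped into one directory per division."""
    written = []
    for division, host_name, address in inventory:
        target_dir = Path(root) / division
        target_dir.mkdir(parents=True, exist_ok=True)
        path = target_dir / f"{host_name}.cfg"
        path.write_text(render_host(host_name, address))
        written.append(path)
    return written


if __name__ == "__main__":
    # Write into a throwaway directory so the sketch never touches a live config.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_configs(tmp, INVENTORY):
            print(path.relative_to(tmp))
```

      Keeping the generated tree under version control, as the comment suggests, means every regenerated host file shows up as a reviewable diff.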

      I hope they changed it in a major way, because last time I tried to use it I was forced to dig through configuration files and learn syntax just to get the thing to see if some server was responding to pings.

      So? What use is a monitoring program that tells you that? If you want to do decent monitoring, you want to monitor the systems, not the devices those systems happen to run on.

      It's a steep learning curve, but have you ever configured apache from scratch? Let alone bind or sendmail.

    9. Re:not good. by dubl-u · · Score: 1

      Oh please. It's NOT THAT HARD!!!! For what it does, it's fairly simple actually. Compared to any other package of similar capability, it's quite average in terms of difficulty actually. No worse than something like Exim or Apache.

      The difference with something like Exim or Apache is that the tricky concepts you need to understand are mostly external constraints. SMTP is weird and complex. Serving files via HTTP and connecting to web apps is slightly less weird, but much more complex.

      A basic install of Nagios, on the other hand, is doing something pretty simple and straightforward. But at least with the 2.x series it was an unnecessarily giant pain in the ass to configure because you had to understand the Nagios-specific way of looking at the world and handling configuration. It may be easy once you know it, but there's a steep learning curve to get you that far.

      Now I'm not griping. It is free. Although I was tempted a couple of times, I never quite got around to fixing it or building a competitor. But if I had to talk somebody through setting up Apache or Nagios over the phone, I'd rather do 5 of the Apache calls than 1 of the Nagios calls.

    10. Re:not good. by glitch23 · · Score: 1

      My personal experience is that Nagios is probably the LEAST easy to use of any piece of software, period. I hope they changed it in a major way, because last time I tried to use it I was forced to dig through configuration files and learn syntax just to get the thing to see if some server was responding to pings.

      I hope this isn't considered too off-topic but, to help your situation, have you looked at Hyperic HQ? It was previewed about a year ago in Linux Journal. We are using it at work (Enterprise edition) and paid for support but both non-enterprise and enterprise versions are open source and the non-enterprise edition is free. They charge by the # of agents you deploy. It can collect simple SNMP data or, using their Java-based agents, can collect even more metrics for operating systems. It works on Windows, Linux, Solaris, and Mac OS X. It automatically detects services running on a system (IIS, Apache, WebLogic, etc.) and monitors their performance and status. Alerts can be created, it supports integration with a directory server (ADS, Sun Directory, etc.), provides the ability to assign roles to users and put devices into groups. And it is all free if you don't want support. There aren't even any limitations with the free version if I recall correctly. It can use its own database or you can install a separate DB. About the only thing it doesn't do is give you a network map but considering it does everything else for you basically automatically (except some SNMP setup) it is worth it. Basic availability info is performed by the agents so you wouldn't even need SNMP for servers. Check it out.

      --
      this nation, under God, shall have a new birth of freedom. -- Lincoln, Gettysburg Address
    11. Re:not good. by Auntie+Virus · · Score: 1

      My personal experience is that Nagios is probably the LEAST easy to use of any piece of software, period.

      You've obviously never installed/configured RT3. I used to use Big Brother, tried Zenoss after reading about it, then tried Nagios. Nagios totally rocks.
      It's not THAT hard to configure. We use Nagios, Snort, Ntop and RT3. Hard to say how much money I saved by not using any of the big commercial products, but it's a lot. But RT3, crap, THAT is hard to get going..

      --
      Why yes, I *AM* new here. Why?
    12. Re:not good. by WuphonsReach · · Score: 1

      My personal experience is that Nagios is probably the LEAST easy to use of any piece of software, period. I hope they changed it in a major way, because last time I tried to use it I was forced to dig through configuration files and learn syntax just to get the thing to see if some server was responding to pings.

      Nagios is not that difficult, especially v2.

      The key to a good Nagios rollout is to start small. As in, a few contacts, a few services, and a single host. Learn what the various objects are. Put the configuration files under version control (extremely useful, try FSVS if you're using Subversion in your company). You should be up to speed and have a good grasp of how everything fits together in under a week.

      After that, Nagios configuration is mostly about organization. Don't keep everything in a single file, or even a single directory. Create a directory for templates, another for contacts, split your hosts across multiple directories by some loose classification. Put each host configuration in its own file.

      --
      Wolde you bothe eate your cake, and have your cake?
    13. Re:not good. by Lord+Ender · · Score: 1

      Apache, BIND, and Sendmail are not easy to configure. If someone were hyping their "ease of use" on here, I would criticize them, as well.

      --
      A slashdotter who didn't build his own computer is like a Jedi who didn't build his own lightsaber.
    14. Re:not good. by isorox · · Score: 1

      Apache, BIND, and Sendmail are not easy to configure. If someone were hyping their "ease of use" on here, I would criticize them, as well.

      Yet all three are used by your grandma every day, they are very easy to use. Easy to maintain too.

    15. Re:not good. by mindstrm · · Score: 1

      And you know, to qualify that... there is definitely a level of enterprise monitoring way above and beyond what nagios provides as well. It's far from the final solution for monitoring.. it's all a matter of scale.

  4. Cacti Users by cmorford · · Score: 2, Informative

    I've used Nagios, but found Cacti and haven't turned back. Any other Cacti users out there? I found Cacti to be much easier to set up than Nagios and fairly extensible for the advanced user.

    1. Re:Cacti Users by Anonymous Coward · · Score: 1, Informative

      Try ClearSite... it's what Cacti is to MRTG. http://clearsite.sourceforge.net/coming-soon.html Linux only for now, but the developers are very nice and will share a newer version if you contact them. -theWiseWan

    2. Re:Cacti Users by amorsen · · Score: 1

      Cacti isn't very useful for alerting, just as Nagios really doesn't work well for graphing (one of its more annoying shortcomings).

      --
      Finally! A year of moderation! Ready for 2019?
    3. Re:Cacti Users by Fez · · Score: 1

      We use both. Cacti for graphing, Nagios for monitoring and paging.

      Nagios 3 did change a bit for the better. However, because they removed MySQL support I had to rewrite large portions of the existing configuration.

      In the process, I made much better use of templating and now each host config is in its own file, and Nagios will load all files in given directories thanks to the cfg_dir= directive.

      For example, all of my servers are in etc/nagios/servers/(servername).cfg, routers are in etc/nagios/routers/(routername).cfg, and so on.

      If I want to add a server, I just pick a similar one and copy the file, change the name/IP/services/etc, and reload Nagios. With the older config (held over from Nagios 1.x) I had to edit half a dozen files or more just to add a single server. Thankfully those days are over!
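      A rough sketch of the workflow described above, written in Python purely for illustration: build the cfg_dir= lines for the main config file, list the .cfg files a directory would contribute, and add a host by copying a similar one. The paths, host names, and addresses are hypothetical, and the text replacement is deliberately naive.

```python
from pathlib import Path
import tempfile


def cfg_dir_lines(config_root, subdirs):
    """Build the cfg_dir= lines that would go into the main Nagios config file."""
    return [f"cfg_dir={config_root}/{name}" for name in subdirs]


def object_files(directory):
    """List the *.cfg files under a directory, including its subdirectories."""
    return sorted(Path(directory).rglob("*.cfg"))


def clone_host(source, new_name, old_ip, new_ip):
    """Copy a similar host's file, swapping in the new name and address.

    Naive string replacement: it assumes the old host name (taken from the
    file name) and the old address appear only where they should be changed.
    """
    source = Path(source)
    old_name = source.stem
    text = source.read_text().replace(old_name, new_name).replace(old_ip, new_ip)
    target = source.with_name(f"{new_name}.cfg")
    target.write_text(text)
    return target


if __name__ == "__main__":
    print("\n".join(cfg_dir_lines("/etc/nagios", ["servers", "routers"])))
    with tempfile.TemporaryDirectory() as tmp:
        servers = Path(tmp) / "servers"
        servers.mkdir()
        original = servers / "web01.cfg"
        original.write_text(
            "define host{\n    host_name  web01\n    address    192.0.2.10\n}\n"
        )
        clone_host(original, "web02", "192.0.2.10", "192.0.2.12")
        for path in object_files(tmp):
            print(path.relative_to(tmp))
```

      After a clone like this, the services in the copied file still need reviewing by hand before Nagios is reloaded.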

      Some of these abilities may have been in Nagios 2.x, but because the old config "Just Worked" it was not changed.

    4. Re:Cacti Users by thanasakis · · Score: 4, Informative

      You are comparing apples with oranges, nagios is for service monitoring, cacti is for diagrams.

    5. Re:Cacti Users by sarabob · · Score: 1

      I found Cacti a nightmare to configure, setting up custom graphs is comically complicated (why can't I just use rrd syntax rather than clicking buttons?) and we always end up with three data sources for the same things. Support for SNMPv3 is patchy, and we needed to jump through hoops to get the graphs to cope with multiple cpus (cpu usage over 100%? It wraps back to zero...).

    6. Re:Cacti Users by cmorford · · Score: 1

      Not necessarily. You can install the monitor plugin (cactiusers.org) to cacti and get all the alerts you want. While I suppose it doesn't come installed by default, it definitely combines graphing and alerting into a single package that works well.

    7. Re:Cacti Users by Fez · · Score: 1

      I've had similar problems with trying to make a custom graph in Cacti. For example, in MRTG, to make a graph that simply added two OIDs, you just set the source to OID1+OID2, and you're done.

      Just try doing that in Cacti. You'll learn more about graph templates and CDEFs than you ever cared to know in the process...

    8. Re:Cacti Users by mindstrm · · Score: 1

      A big install would use both.. as they are very different tools.

      Nagios is a monitoring & alert framework.

      Cacti is a graphing framework...

      Does cacti have some ability to do problem detection, notification, escalation, acknowledgement, resolution, and trend reporting?

    9. Re:Cacti Users by cmorford · · Score: 1

      To some extent yes. There are some plugins at cactiusers.org, one being a monitor plugin. Now it may not be as extensive as nagios, but it's a start.

    10. Re:Cacti Users by Anonymous Coward · · Score: 0

      Fucktard ;

      cacti is just a graphing tool with some extras

    11. Re:Cacti Users by Fweeky · · Score: 1

      Tried Munin? I was quite impressed when I installed it and found it'd auto-detected a whole bunch of locally graphable stuff.

  5. Why not? by Drakkenmensch · · Score: 1

    Some of us still have the iBall legacy reader for the .DTF (Dead Tree Format) file type!

  6. No way to send mail with link to SLA report by Anonymous Coward · · Score: 0

    I gave up using Nagios when I found that there was no way to send an email to management linking to a preprepared (by month) Nagios SLA report.

    What's the point of having a great monitoring tool if you can't share the reports in a useful and practical manner?

    So I switched to Zabbix. Not happy with that either.

    1. Re:No way to send mail with link to SLA report by Neil+Watson · · Score: 1

      You might look at OpenNMS or Zenoss. The key to choosing the best monitoring service for you is to clearly define what you want to monitor and how you would like that information presented.

  7. If you feel Nagios is too difficult by guruevi · · Score: 1

    Try Pandora FMS. It does the same as Nagios, is open source but only requires a minimum knowledge of shell scripting to get it working and can monitor everything you can think of inside (using an agent) or outside a host. I monitor about 100 hosts with it and have about 1200 data points every 5-10 minutes (temperatures, network packets, processes etc.) but it scales much larger (using MySQL as backend) even on simple hardware.

    --
    Custom electronics and digital signage for your business: www.evcircuits.com
  8. Monitor my gf by Anonymous Coward · · Score: 1, Funny

    I actually managed to get a girlfriend. And she is a real hottie. I don't want to sound paranoid, but I need to monitor her to make sure she is not cheating, and keeping her AV up to date, you know what I mean.

    So when I read this, "no device or situation has yet been found that cannot be monitored using Nagios and a pre-made or custom script, plug-in or enhancement", I thought this would be perfect.

    I think a Nagios plug in would be best, preferably something with sharp blades. So how do I install this in my gf, and have Nagios monitor it?

    1. Re:Monitor my gf by yukk · · Score: 1

      I think the device you're looking for is known as a chastity belt but you'd need to couple it with a personal GPS tracking device. Or you could just hire a Private Eye.
      Of course, whether she continues to like you after all this is not my responsibility.

      --
      The trouble with the rat race is that even if you win, you're still a rat." Lily Tomlin
    2. Re:Monitor my gf by dfn_deux · · Score: 2, Informative

      I understand that your comment was made in jest, but.... Nagios is a really flexible polling and alerting framework. There is nothing in nagios that makes it specifically tailored to monitoring computers or services. For example, there is no reason why you must use the HostAddress directive to hold an IP or a hostname, it could just as easily be a street address, phone number, SSN, etc... And likewise there is no need for polling to actively poll, you can just as easily configure nagios to only respond to passive updates. So, just for the sake of argument, if you really wanted to use nagios to track/control a human's actions and movements you could combine passive monitoring by having an investigator follow your target and supply them with either a phone number, email address, or website where they could submit a "check result" while at the same time you could do active monitoring by utilizing any number of GPS/cellular logging devices combined with a small analysis script with some thresholds. If you wanted you could even use the output of the gps to update the relative location of "nodes" on your status map... I believe one of the examples in the documentation has phone numbers for local pizza places used as HostAddresses and has a dial out script to check the average rings to answer for phone availability validation.
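      For the passive updates mentioned above, a result is normally handed to Nagios as a PROCESS_SERVICE_CHECK_RESULT external command line. The Python sketch below only formats such a line; the host name, service description, and output text are hypothetical, and the location of the external command file differs per installation, so nothing is written anywhere.

```python
import time

# Standard Nagios return codes for check results.
STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_result(host, service, state, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line.

    Layout: [timestamp] COMMAND;host_name;service_description;return_code;output
    """
    if ";" in host or ";" in service:
        # A semicolon would shift the fields, so refuse it outright.
        raise ValueError("host and service names must not contain ';'")
    timestamp = int(time.time() if now is None else now)
    one_line_output = output.replace("\n", " ")
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{STATES[state]};{one_line_output}\n"
    )


if __name__ == "__main__":
    # Hypothetical host and service names; in a real setup the line would be
    # appended to the Nagios external command file instead of printed.
    print(passive_result("warehouse-door", "Door Sensor", "WARNING",
                         "door open for 12 minutes"), end="")
```

      On the Nagios side the matching service would typically have passive checks enabled and active checks disabled, so it changes state only when a line like this arrives.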

      --
      -*The above statement is printed entirely on recycled electrons*-
    3. Re:Monitor my gf by isorox · · Score: 1

      I think the device you're looking for is known as a chastity belt but you'd need to couple it with a personal GPS tracking device

      Or you could just hire a Private Eye.

      Of course, whether she continues to like you after all this is not my responsibility.

      Indeed, Heisenberg did say that the act of monitoring something will impact it.

  9. Nagios can die in a fire by Anarke_Incarnate · · Score: 1

    Long live Hyperic. Free and extensible without the nonsense to set up that is Nagios. You can use Nagios plugins for it if you so wish.

    1. Re:Nagios can die in a fire by druiid · · Score: 1

      Hyperic is incredible.. but for my uses I need the enterprise version. Paying an extremely high amount of money for only 25-30 servers is not in the cards... and thus I chose zabbix, which does enough right to be a good replacement.

    2. Re:Nagios can die in a fire by Anarke_Incarnate · · Score: 1

      The non enterprise can be used in a 30 server environment, but you DO give up a lot of functionality and have to re-invent things for it to work. I just wish they open sourced the lot of it, but unlikely

  10. Slashdot Book Review Template by rhizome · · Score: 3, Informative

    1st Paragraph: Paraphrase of Foreword.
    2nd Paragraph: What the initial chapter(s) is (are) about.
    3rd Paragraph: What the next chapter is about.
    4th Paragraph: What the chapter after that is about.
    5th Paragraph: What the last chapter(s) is(are) about.
    6th Paragraph: Pithy criticisms for balance.
    7th Paragraph: Conclusion with the required, "This book is useful if you are like me" statement, as in, "Overall, this is a great book for anyone using Nagios as more than a casual user, and is still very informative for the casual user."

    --
    When I was a kid, we only had one Darth.
  11. I have this book, it is not impressive. by hax4bux · · Score: 5, Insightful

    This book is not a big leap over the supplied Nagios documentation. I bought it out of guilt, but I doubt I have gotten my money's worth. This is not so much a criticism of the book as praise for the supplied documentation (which is rather decent, given the topic).

    Getting Nagios (or OpenView or whatever management system you have) working is a big job which will not be solved w/a $40 book and an afternoon.

    For all of you who complain about Nagios being complicated, I hope you never see OpenView (et al).

    If you haven't seen Nagios, there is a daemon which performs the collection. The UI is browser based (Apache HTTPD CGI applications). Sometimes there are agents on remote machines to collect status like process tables, disk utilization, etc.

    Nagios is essentially a job scheduler/messaging system. Monitoring is performed by invoking little programs dedicated to collecting information, and these are easy enough to create. There are lots of hooks if you need to extend the system.
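    To give an idea of what one of those little programs looks like, here is a minimal Python sketch of a disk usage check. It follows the usual plugin convention of one line of output plus a return code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The 85% and 95% thresholds and the "DISK" label are illustrative assumptions, not Nagios defaults.

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(percent_used, warn=85.0, crit=95.0):
    """Map a usage percentage to a plugin state (thresholds are illustrative)."""
    if percent_used >= crit:
        return CRITICAL
    if percent_used >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=85.0, crit=95.0):
    """Return (state, one-line message) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero total size"
    percent = 100.0 * usage.used / usage.total
    state = evaluate(percent, warn, crit)
    return state, f"DISK {LABELS[state]} - {percent:.1f}% used on {path}"


def main():
    """Print the status line and return the exit code a plugin would use."""
    state, message = check_disk()
    print(message)
    return state


if __name__ == "__main__":
    exit_code = main()
    # A deployed plugin would finish with sys.exit(exit_code) so that
    # the scheduler sees the state as the process exit status.
```

    The scheduler only needs the exit status and the first line of output, which is why such checks can be written in nearly any language.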

    Since the UI is owned by HTTPD, so is access control. Who doesn't know how to set up LDAP or an auth file for Apache? Most of the CGI plugins are implemented in C and are not ugly to look at.

    The agent issue is a little clouded because there are many agents to choose from. I usually just use the Net-SNMP agent because I have a lengthy SNMP background, but that is just my personal choice.

    I will stop here since the article is about a book and not Nagios. I merely wanted to address some of the criticisms of Nagios.

    1. Re:I have this book, it is not impressive. by tcopeland · · Score: 1

      > I bought it out of guilt, but I doubt I have gotten my moneys worth.

      I usually figure if I get _anything_ at all out of a book then it's worth the price. I just bought a Puppet book and just having it around for occasional skimming has gotten me familiar enough with Puppet that I'm willing to give it a whirl. And for $17.99, meh, good enough.

    2. Re:I have this book, it is not impressive. by BitZtream · · Score: 2, Insightful

      For all of you who complain about Nagios being complicated, I hope you never see OpenView (et al).

      I used to run an OpenView server ... my god, getting that thing to do useful stuff was like getting a cat to listen to your commands, it can be done, but why the hell bother.

      Since that job, I've come to love Nagios (which is still complicated) because it's about a billion times easier to deal with than OpenView. Nagios IS complicated, but its job IS complicated and Nagios does a hell of a job when compared to something like OpenView.

      I've found however the best way to monitor servers is to just put your cell phone number in a nice public place, you get practically instant notification of a problem, sometimes you get notification years before the problem exists! You even get notification of problems completely unrelated to your services/network, heh.

      --
      Persistent Volume manager for Kubernetes - https://github.com/dwimsey/openshift-pvmanager
  12. I like Zabbix by Anonymous Coward · · Score: 0

    Nagios was ok but the mapping on Zabbix is way better. I have it running in a virtual server on top of ubuntu. works great! Very flexible, can create your own icons, the graphs are easy to setup..

    We used what's up gold before. which has a nice interface but when you consider buying a server license, buying what's up for $2,500 for 100 devices, and then dealing with ms sql, Zabbix wins.

  13. not in stable portage by Anonymous Coward · · Score: 0, Troll

    ... come back to me when nagios v3 is marked stable. :P

    1. Re:not in stable portage by Darby · · Score: 1

      ... come back to me when nagios v3 is marked stable. :P

      Unmask it, it works fine for me on x86_64. I upgraded over a month ago, no issues and many improvements.

    2. Re:not in stable portage by jaydubscott · · Score: 0

      umm...from the website (http://www.nagios.org/)

      Latest versions:
      Nagios 3.0.3 (stable)

  14. Nagios is well knows in bigger enviroment by Krneki · · Score: 1

    If your environment is big enough so you can employ at least 1 person to fully work with Nagios, then it's a great product. But out of the box it needs too much time to become usable. I'm talking of Nagios 2, I have no experience (yet) with Nagios 3.

    --
    Love many, trust a few, do harm to none.
    1. Re:Nagios is well knows in bigger enviroment by thegrassyknowl · · Score: 1

      Nagios is good, but people have gotta start learning its limitations. I recently brought up the issue of security with some people and their answer: Nagios... There is the right tool for the right job, and Nagios is one small tool in the admin's arsenal to solve one problem. It's being touted as a universal panacea by those with little real knowledge and it's a little scary.

      --
      I drink to make other people interesting!
    2. Re:Nagios is well knows in bigger enviroment by Fweeky · · Score: 1

      We use Nagios in a 2 man team monitoring about 30 hosts and 200 services, including quite a few custom ones. It's not that hard, once you get used to how it works.

    3. Re:Nagios is well knows in bigger enviroment by WuphonsReach · · Score: 1

      If your environment is big enough so you can employ at least 1 person to fully work with Nagios, then it's a great product. But out of the box it needs too much time to become usable. I'm talking of Nagios 2, I have no experience (yet) with Nagios 3.

      We use it for about two dozen hosts, a few hundred services, and a handful of technical support users.

      All of which took about a week to get up and running. But other than small re-configurations when we move systems around or change the network, it's very much a "set up and not worry about" tool.

      Such a great tool - we know there's a problem before the users do (usually). So when our phones start ringing, we're not scrambling in the dark wondering what is actually broken. It gives upper management the impression that you're on the ball and have your finger on the pulse. So even when things break, being able to tell your boss that you know what is broken and you're already working on it can pay off.

      Plus there's a side-benefit... can we set things up well enough that we never see Nagios alerts more than once a week? It becomes a bit of a game.

      Network and server monitoring is critical for a small, overworked support staff. Set up warning thresholds, and you'll never be surprised about a disk running out of space, or a server running out of memory again. Or at least, you'll be warned in advance and have a bit of time to address the issue before it takes something down. Small staff don't have the time or people to sit and check on things like disk temperatures, free space, server loads, process counts, size of the mail queue, etc.

      It's a lot like any other tool. The more you put it to work, the more you'll value it.

      --
      Wolde you bothe eate your cake, and have your cake?
  15. OpenNMS is better by viridari · · Score: 2, Informative

    I don't know why OpenNMS doesn't get more credit, maybe because it's a Java app, but it's a damned good one.

    Get a very basic OpenNMS configuration going, point it at a range of IP addresses, and it will auto-discover most of what's out there. If you've got your SNMP agents up and running properly, it'll automatically start checking the more important OIDs for you and graphing them with an RRD back end. Most of the setup can be done through the web interface instead of through vi. You don't have to restart the daemon every time you add a node.

    If Nagios drives you a bit batty, check out OpenNMS.

    1. Re:OpenNMS is better by Elshar · · Score: 1

      I've been wanting to try OpenNMS for years now, but it doesn't work out of the box with FreeBSD (all the java dependencies, etc..) and I could never actually get it to compile properly even after fiddling with it. It's really a shame too, I've only heard good things about it and REALLY tried to get it working with my current system (All *BSD boxes). Maybe someday I'll get a linux box that will work with it, but as of now I already have a network monitoring box with nagios, cacti, etc on it.

  16. Nagios has much better competitors by druiid · · Score: 2, Interesting

    I used nagios for years.. many many years. It has to be, as many have already pointed out.. the most difficult to configure OSS project ever made.

    That said, it was fairly powerful once configured properly.

    The thing is, though, that it has many shortcomings. I found a much better (although not necessarily as scalable) monitoring and data-gathering solution in Zabbix. They recently released a new version as well that adds many really nice capabilities like IPMI support.

  17. Hobbit: spin off of Big Brother by djs1w · · Score: 1

    I currently monitor about 250 hosts with Hobbit (http://sourceforge.net/projects/hobbitmon) and have had good success with it. It has trending (RRD graphs) and alerting thresholds (ie 85% full, email, 95% full pagers) built in together. It is also customizable. We have created several perl scripts that check random applications for various things that are also tracked with Hobbit. How much data Netbackup backed up last night. How many users are logged into our portal. Are the tape drives within Netbackup up or down? The list goes on and on. It runs on Unix and Windows, although the Windows client isn't as robust.

    --
    There is no such thing as secure systems, only secure admins.
  18. Zabbix by kosmosik · · Score: 4, Interesting

    I like Nagios but I can't really imagine how to apply it in a large (think ten thousand hosts) setup across multiple regional/organizational branches and so on.

    Also Nagios *is* painful to set up. First of all AFAIK there is no way to delegate administration, e.g. to organizational branches. Configuration is just a big pile of config files included from some other config files etc. There is no autodiscovery/autoconfiguration of hosts since the Nagios team believes it is BAD etc.

    Well IMHO Nagios is great but it is like a big fat pile of hacked scripts and configs. Not too elegant but working.

    Now... I am (well we are in my organization) using Zabbix and I find it great. It is much better organised/elegant than Nagios.

    In Zabbix architecture you have well designed atomic elements like checks, items, services (groups), etc. It also gathers fine tuned historical data for trends and historical review. You can compact the data (lower the resolution) after a given time and so on. It is in fact a very complete monitoring framework with its own internal condition language and escalation engine. You can gather data from network checks, SNMP, custom scripts, Zabbix agents (available for most platforms) etc.

    And it has normal configuration, not crude text config files. I have nothing against text files but sometimes I don't really want to open my text editor only to quickly set up an ad-hoc overview screen with maps, graphs, status displays and clocks, and you can have a few such screens rotating on your big screens in the NOC. All with mouse clicking.

    I can give it as a tool to a sysadmin and he or she can work with it without having to study manuals. Not everybody in your organization is a Unix hacker you know...

    We have dozens of branch servers which are managed by local sysadmins and a farm of central servers which is managed by central staff.

    Zabbix works in a distributed manner so a local branch can have a very detailed view of their infrastructure and at central level I can have a functional/business overview of the entire infrastructure and core services (like business systems, transactions etc.) Not just simple checks if RAID is OK - I don't care if RAID in some server is OK. I need to know why (where, who to blame) a given service (be it MQ/WebSphere) is not working as desired.

    And also it is free, open source and available in most Linux/Unix distributions as a standard package. So when considering an enterprise monitoring platform do yourself a favour and also check Zabbix.

    http://www.zabbix.com/downloads/ZABBIX%20Manual%20v1.6.pdf

    1. Re:Zabbix by knarfling · · Score: 1

      I'll second Zabbix. It has gone through some growing pains, but I like it for its ease of use as well as its flexibility. Until this last version, it did not have good escalations or repeat notifications, which was a big problem. However, with 1.6, that has been corrected.

      One of the things I liked about Zabbix was the ability to write custom checks. If you could get any script or program to spit out data, you could very easily capture that data and run checks on it. The windows client could read Windows Performance Counters, so a TON of custom checks were easily written. In my last job I used it to monitor an incoming feed from another company. If I didn't receive info from the company for 20 minutes, I could send out alerts for someone to check the feed. I am sure that could be done with Nagios, but it was much, much easier with Zabbix.

      --
      Great civilizations have lived and died on false theories. Don't mess up mine with a few facts.
    2. Re:Zabbix by kosmosik · · Score: 2, Informative

      Well for me what ruled out Nagios was:

      1. It is painful to set up. Don't get me wrong - I've done my time on the configuration and I think I know it a little bit, and I can easily set it up for like 100 hosts with some templates +includes +sed magic. But that is what I can do. Not all of my staff can do it and it really is not easy.

      2. It is not distributed. The checks can be distributed. But you cannot have like 20 child Nagios nodes managed by local staff and parent nodes that gather data from children. This is a killer feature of Zabbix for me. I can send out a standard configured box/server with Zabbix to my local staff. Give them access via LDAP/AD. And tell them to configure it so it suits *their* local setup (well we have quite uncommon/unstandardized branches - historical/political reasons). Then I can gather data from their local system (they have configured it) and process it in a central place so I can have a clear overview of what is going on in the infrastructure. I really have no clue how to do it with Nagios - probably it is possible with some ninja-like-hacking but it is not something (ninja-like-hacking) you like for a big organization. You need clean, manageable stuff.

      3. Zabbix can collect and really process historical data. If for some reason I wish to know how my network bandwidth evolved over the past year I can quite easily click and get some nice graphs and reports, and even forecast some stuff based on various trends.

      To summarize Nagios for me seems like perfect tool for sysadmin. But it is not so good for enterprise monitoring where you have quite different goals.

    3. Re:Zabbix by Anonymous Coward · · Score: 0

      I have to agree, Zabbix is by far one of the best *sysadmin* tools I have ever used.

    4. Re:Zabbix by secolactico · · Score: 1

      I'll second Zabbix. It has gone through some growing pains, but I like it for its ease of use as well as its flexibility. Until this last version, it did not have good escalations or repeat notifications, which was a big problem. However, with 1.6, that has been corrected.

      As a current user of Zabbix who happens to like it despite perceived shortcomings, I have to say that tears nearly came to my eyes when 1.6 was released. It even has a dashboard which was the feature I had missed the most from Nagios (when you have more than a handful of systems the overview screen gets quite crowded).

      --
      No sig
    5. Re:Zabbix by ravydavygravy · · Score: 1

      I tried out zabbix this morning, as I'm always keen to try and improve our monitoring (we currently run a nagios 2.x system).

      In-house, we only use rpm'd software, for ease of upgrading and maintenance, so I quickly found RHEL4 rpms for the various bits and loaded them up. That's when the pain started - I found the post-rpm setup horribly broken, and gave up after an hour or so. The web frontend had to be hacked to display, then the DB connection script had to be hacked for it to recognise the DB (which I also had to manually install). After all this, it started moaning about permissions, so I said f*** it.

      Now, a lot of these issues may have been with the person who packaged the rpms, but for me it was a non-runner - building from source on my production servers doesn't happen.

      I don't find nagios hard to configure - in fact all our admins (team of 15) find it ok, and only half of them are real linux admins, the rest are win32 background. Its object based config scheme seems pretty logical to me....

    6. Re:Zabbix by kosmosik · · Score: 1

      Just what kind of argument is it? You've installed RPMs from unknown source on production server and complain that the RPMs are broken... Quite silly really.

      If you still want to check out Zabbix and you use RHEL I recommend the EPEL RPMs packaged by the Fedora community. They work fine for me:

      http://fedoraproject.org/wiki/EPEL

      But there is no automatic database creation script since no sane admin needs such a thing.

      Installation is as simple as:

      1. Install RPMs via YUM or by hand.
      2. Create database user and database with proper privileges (RTFM on how to do that).
      3. Load initial database schema from supplied SQL file.
      4. Configure zabbixd and zabbix_web (PHP frontend) - you need to supply database credentials and that is basically it.
      5. Start the daemons and you are done. From now on you can point and click most of the basic stuff.

  19. Rather use a hammer than a rock by Anonymous Coward · · Score: 0

    By the very nature of network administration the associated tools are going to be complex. Nagios offers functionality and control options that are appropriate given the type of operations for which most admins are responsible. I'm looking forward to digging into V.3. Like the man said, don't hate the hammer hate the house!

  20. Zenoss by msimm · · Score: 1

    If you haven't already, take a look at Zenoss. Aside from having a pretty well designed UI (which as I get older I'm beginning to feel deserves more credit in the usefulness dept), supports SNMP by default (I'm not a big fan of clients unless I REALLY need them) *plus* it supports Nagios plugins.

    And I'm not trying to steal any thunder here, I think Nagios is a great option.

    --
    Quack, quack.
  21. Third on Zabbix by Anonymous Coward · · Score: 0

    I had put in a comment earlier about zabbix and it seems to be missing.

    anyway, I third on zabbix. It works great and is very flexible. I really like the mapping!!

  22. I use PandoraFMS by Anonymous Coward · · Score: 0

    I've been using PandoraFMS 1.3.1 and now pandorafms 2.0 and for me it's the best monitoring tool nowadays.
    Nagios, for me, has very poor reporting, which is not helpful whatsoever, and the interface is awful.
    What I like about pandorafms is the graphs and how easy it is to monitor a host from the webconsole; nagios is horrible to set up.

    my 2 cents

  23. Nagios is a steaming pile of doggy-do by Anonymous Coward · · Score: 0

    no, it really is the biggest pile of cow crap I've ever seen in my entire life.

    it sucks ass worse than a hoover. I'm not joking.

    get a *real* monitoring system

  24. Nagios documentation by kosmosik · · Score: 1

    Right now I wanted to check Nagios documentation for simple thing - configuration file syntax. This is the basic stuff. It is the first thing that should be defined in reference manual. I like to know how the files are processed. How do I do comments. How do I define multiple line commands and so on.

    Please point me out that I am blind or stupid since I really cannot find it in manuals here:
    http://nagios.sourceforge.net/docs/3_0/toc.html

    Also I find the online manual quite retarded/clunky. It doesn't even have a search! I wonder why they haven't used some kind of wiki (any serious wiki system has search) or similar.

    1. Re:Nagios documentation by hax4bux · · Score: 1

      Have you looked at the sample commands which come w/the distribution?

    2. Re:Nagios documentation by Spad · · Score: 2, Informative

      You'll be wanting:

      http://nagios.sourceforge.net/docs/3_0/configmain.html
      http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html
      http://nagios.sourceforge.net/docs/3_0/configcgi.html

      Initially. There's a lot of stuff that isn't linked directly off the TOC, which is a pain, but it can be found with a bit of digging (or download the PDF and search it).

      The FAQs (http://www.nagios.org/faqs/) also have a fair amount of useful info (Such as why the bloody thing won't use GD2 without a lot of arsing around).

      I'd also recommend the forums here: http://nagios.meulie.net/ (Though they seem to be down at the moment).

    3. Re:Nagios documentation by kosmosik · · Score: 1

      OK but it still does not answer my question how to put long variable into multiple lines. Do I do it "bash" style like\
      that. Or some other way?

      Also the pain - I've said that the documentation is the pain. Also I prefer complete/reference documentation over FAQs/forums.

  25. Nagios is a mess by Kent+Recal · · Score: 2, Insightful

    Blech, nagios is probably the most disgusting hack currently in wide use. It was overdue for a complete rewrite after Nagios2 - but nagios hackers don't seem to have any pain threshold. Nowadays it's not even funny anymore. Nagios has gone *way* over its expiration date. The closest analogy would be a pot of milk that has been sitting in direct sunlight for 6 months straight...

    I strongly suggest that anyone looking for a monitoring solution stays away from the dead horse and looks at the modern alternatives first. There are plenty: Munin, Cacti, Zenoss, Pandora, OpenNMS, just to name a few.

    Most importantly: Take your time before you decide and evaluate thoroughly. A monitoring solution will stick with you for a long time and migrating to a different software is usually a very painful process. Which, btw, is the main reason why so many sites still ride the dead horse...

    1. Re:Nagios is a mess by kosmosik · · Score: 1

      You are partially right - Nagios is a bit legacy.

      But you have mentioned Munin and Cacti - these are just simple graphing solutions. Munin is generally useless - you have only year/month/day views (or similar), you cannot zoom into e.g. a 2 hour range last week. Cacti is totally better than Munin.

      But also Cacti is just a simple SNMP polling and then graphing solution. It has some plugins for thresholds but it really is not the same class of solution as Nagios (or better Zabbix).

      Nagios is an *engine* that processes messages. Something like message broker. Cacti is something quite different.

  26. GroundWork by pak9rabid · · Score: 1

    For those of you that aren't particularly fond of the complexity of Nagios' configuration, check out GroundWork. It's basically Nagios + a fairly easy-to-use web interface. We've been using it up at my work for over a year and it works great.

  27. WTF? LOL... by Colin+Smith · · Score: 2, Insightful

    I used nagios for years.. many many years. It has to be, as many have already pointed out.. the most difficult to configure OSS project ever made.


    R$+@$=W $@$1@$H user@thishost -> user@hub
    R$=W!$+ $@$2@$H thishost!user -> user@hub
    R@$=W:$+ $@@$H:$2 @thishost:something
    R$+%$=W $@$>3$1@$2 user%thishost

    Sendmail...

    Nagios is easy, but it only makes sense if you have dozens or hundreds of systems, for less, get something simpler, and it will only work if you understand how to group your hosts, services etc.
     

    --
    Deleted
    1. Re:WTF? LOL... by dubl-u · · Score: 1

      Sendmail is necessarily hard. Mail routing at the time was complicated. Now it's easier, which is why Postfix is a snap to configure for common cases, and why a lot of Sendmail admins never have to see the scary magic at the heart.

      Nagios, on the other hand, is unnecessarily hard. Especially for simple setups and novice users, the pain is ridiculously out of proportion to the gain.

    2. Re:WTF? LOL... by Colin+Smith · · Score: 1

      Nagios, on the other hand, is unnecessarily hard. Especially for simple setups and novice users, the pain is ridiculously out of proportion to the gain.

      But it's not for the simple setups and novices, it even says that in the manual. What Nagios is easier for are those situations where you need to monitor some custom service.

      A monitoring system reflects the complexity of the systems and services it monitors. If you have a relatively simple network with standard services then Nagios probably isn't required. Try Zabbix instead, it handles those situations fairly well.

       

      --
      Deleted
  28. import Nagios data into Cacti? by Anonymous Coward · · Score: 0

    You are comparing apples with oranges, nagios is for service monitoring, cacti is for diagrams.

    If you're polling the devices anyway, why can't you feed the data into a database and draw charts? Why do you need two systems? It all comes from the same source.

    1. Re:import Nagios data into Cacti? by Anonymous Coward · · Score: 0

      because it's the most straightforward thing to do. 99% of the time it is very very cheap to poll, using SNMP for example. Nagios was not designed to replace mrtg, so not surprisingly the solution it gives is like bolted on the system. Try monitoring traffic on hundreds of interfaces and see where nagios gets you. Not pretty. On the other hand, cacti does not have the capabilities of nagios when it comes to service check scheduling. Using both really scales very well. At least in my experience that is.

    2. Re:import Nagios data into Cacti? by perldork · · Score: 1

      If you're polling the devices anyway, why can't you feed the data into a database and draw charts? Why do you need two systems? It all comes from the same source.

      You don't need two tools, the PNP Plugin does RRD graphing from performance data returned from Nagios (there are other add-ons for Nagios that do this as well), PNP is just the most flexible in my opinion.

  29. Hard to set up? by isorox · · Score: 3, Insightful

    So Nagios is hard to set up? Probably, you can't go from zero to running in 5 minutes. It's a steep learning curve, but if the initial investment of a book (I used Building a Monitoring Environment with Nagios) and a few hours is too much, you shouldn't be monitoring things. You won't do it correctly, you may as well throw some cron jobs together.

    The first step in monitoring is working out what you want to monitor. The second step is working out what you really want to monitor. The third step is working out how you want to display problems. When you have 60 people in support working on a 6 shift 24/7 pattern, you can't expect emails to be any use. "Service problems" in nagios is fine, but there's a lot of issues that 2nd line don't need to know about -- solaris security patches on an intranet for example, can wait until the 9-5 admins get in.
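
    A rough sketch of how "can wait until the 9-5 admins get in" might be expressed in the object configuration: the check still runs around the clock, but notifications are only sent inside a time period. The names office-hours, intranet01 and check_solaris_patches are hypothetical, and generic-service and admins are assumed to come from the Nagios 3 sample configuration.

    ```
    define timeperiod{
            timeperiod_name         office-hours
            alias                   Weekdays 9 to 5
            monday                  09:00-17:00
            tuesday                 09:00-17:00
            wednesday               09:00-17:00
            thursday                09:00-17:00
            friday                  09:00-17:00
    }

    define service{
            use                     generic-service         ; sample template (assumed)
            host_name               intranet01              ; hypothetical host
            service_description     Solaris security patches
            check_command           check_solaris_patches   ; hypothetical command
            notification_period     office-hours            ; hold alerts until 9-5
            contact_groups          admins
    }
    ```

    Anything that 2nd line does need to hear about at 3am simply keeps a 24x7 notification period.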

    Nagios is painfully easy to administer, if you set it up right. Once you know what you're doing (or even know enough to be dangerous, like myself), you can deploy a new nagios installation in about 20 minutes, add a new device that follows existing rules (new web server for example) in under 5 minutes, and a new device with new plugins in half an hour.

    Nagios then grows organically. When something strange and new breaks we cobble a plugin together.

    Configuration is in plain text files, one for each device on the network. I have these as a Subversion working copy, which gives me the ability to track changes and easily roll back any configuration problems.

    We have dozens of weird bespoke plugins: one uses WWW::Mechanize and Perl to run through a workflow on a specific webpage, another looks at the rate of change of growth of a jboss logfile and the proportion of stack traces, and one logs into a remote machine and checks jumbo pings are working through the network.

    We find nagios essential to monitor the service we provide. I don't particularly care if the server an oracle database runs on is pingable, I care if I can log in and run "select 1 from dual" (or usually something more application specific).

    The small system we monitor is made up of about 800 services over 190 devices.

    1. Re:Hard to set up? by WuphonsReach · · Score: 1

      Configuration is in plain text files, one for each device on the network. I have these as a Subversion working copy, which gives me the ability to track changes and easily roll back any configuration problems.

      That's a big strength of Nagios (using plain text files). We use FSVS on our servers, with a SVN back-end. It's so nice to be able to track changes and do easy diffs between versions.

      (We use FSVS because it doesn't create .svn folders. It's more suited for version controlling things like /etc or even the entire server.)

      --
      Wolde you bothe eate your cake, and have your cake?
  30. nagios = headache by zeki893 · · Score: 1

    The number of people complaining in this topic about the ease of use of nagios shows that nagios is lacking. Trying to figure out nagios is a waste of time when there are so many alternatives out there that are much easier to use.
    e.g. cacti (mostly for graphing, but can be used for alerts using plugins), pandora FMS, groundworks, sitescope, the Dude.

    1. Re:nagios = headache by Destoo · · Score: 1

      Groundworks is actually just an interface for Nagios. It was very straightforward to set up.
      the Dude is an excellent Windows alternative.
      I'll try to look at Cacti.

      --
      Nouvelles de jeux et technologies en français. TC
  31. Too many amateurs using Nagios by rossz · · Score: 2, Informative

    I have never once personally had any dealings with a properly implemented Nagios system. Every single time it was obviously tossed up by someone who had minimal knowledge of how to properly monitor the infrastructure.

    The biggest complaint I hear is "too many alerts". So set your dependencies properly! You say you did that but you still get 600 alerts when the router dies? That's because you told it you wanted the alerts. See that "u" in "notification_options"? That means "unreachable". You asked to be alerted when the box can't be reached. You probably wanted "d,r", not "d,r,u".
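
    For illustration, a host definition along those lines might look like the sketch below. The host names and address are made up, and the linux-server template is assumed to be the one from the Nagios 3 sample configuration. The parents line is what lets Nagios tell DOWN from UNREACHABLE in the first place.

    ```
    define host{
            use                     linux-server    ; sample template (assumed)
            host_name               web01           ; hypothetical host
            alias                   Web server 01
            address                 192.0.2.10
            parents                 core-router     ; hypothetical router host
            notification_options    d,r             ; down and recovery, no "u"
    }
    ```

    With that in place, a dead core-router should produce one alert for the router while the hosts behind it go UNREACHABLE quietly.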

    The next complaint. It's so much work to add a system. Huh? It takes me about 30 seconds to add another system and all the tests I need. The trick is using host groups to automatically assign tests to a system. For example, using a generic LAMP type server. What can we assume about this? It's running Linux, Apache, MySQL, and Perl or PHP. That's a bunch of tests right off. In my world, SNMP is assumed on all systems (because I made it that way, that's why). So we define a bunch of service checks using SNMP, but instead of using "host_name some_hostname", we use "hostgroup_name lamp-servers". Now when I add a new server, I add "hostgroups lamp-servers" to the definition and like magic it gets all the tests I need: snmp port responding, ssh access, apache daemon running, mysql daemon running, web page accessible, disk space good (defined in snmpd.conf), CPU usage, load average, plus some automatic dependencies: all snmp tests depend on the snmp port responding. Web pages are dependent on the apache daemon running, etc. I even have some simple graphing included automatically. Even the O/S icons are defined by the hostgroups. Each distro has its own hostgroup which takes care of that detail (e.g. centos-system and ubuntu-system).
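
    A minimal sketch of that hostgroup trick, cut down to two checks. It assumes the generic-service and linux-server templates and the check_http and check_ssh commands from the Nagios 3 sample configuration; the host name and address are made up, and the SNMP checks and dependencies described above are left out.

    ```
    define hostgroup{
            hostgroup_name          lamp-servers
            alias                   LAMP servers
    }

    # Attached to the group, so every member host gets these checks
    define service{
            use                     generic-service
            hostgroup_name          lamp-servers
            service_description     HTTP
            check_command           check_http
    }

    define service{
            use                     generic-service
            hostgroup_name          lamp-servers
            service_description     SSH
            check_command           check_ssh
    }

    # A new server only has to join the group
    define host{
            use                     linux-server
            host_name               web02           ; hypothetical host
            alias                   Web server 02
            address                 192.0.2.11
            hostgroups              lamp-servers
    }
    ```

    Adding a third check for every LAMP box is then one more service definition, with no host definitions touched.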

    Ten simple lines to define a new host can result in 20 service checks. I rarely need to define a new service check. And when a router goes out? One alert for the router.

    Not every system is going to be generic like this, but any time I have more than one system require a specific service check, I create a hostgroup to handle it.

    --
    -- Will program for bandwidth
    1. Re:Too many amateurs using Nagios by perldork · · Score: 1

      Another big help that also is a source of the 'too complex' complaint is using the object-oriented features of the Nagios configuration language to define base node object definitions that define the host type (base definitions also include host group associations for the service-hostgroup-host relationships you mention) .. I use SNMP a lot with Nagios as well, and especially with Nagios 3 I use this feature.

      To continue your example :), I might have a host template for a web server that puts it in the right host groups for agent-less checks, but then I also create host templates that define the custom attributes needed for each type of host-based SNMP agent we have.

      So, for example, for Sysedge this would include hard-coding SNMP version 1, port 1691, etc .. and that agent template can have a Sysedge hostgroup associated with it to give the client whatever Sysedge-specific and generic SNMP tests we have in place as well through hostgroup associations.

      Then the check commands just look for the custom variables I use throughout my config to define snmp port, version, community, etc .. so adding a new web host is then reduced to just 5 attribute lines :p


      define host {
              use my-webhost-template
              hostgroups +sysedge-agent
              host_name myhost.example.com
              address 192.168.1.1
              parents my-host-parent
      }

      Yes, this requires thought and planning ahead of time but the payoffs in mid-long term maintenance and scaling of a configuration are huge.
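
      As a sketch only, the pieces behind a host like that could look something like this. The template name, command name and community string are hypothetical; the custom variables (the directives starting with an underscore) are the Nagios 3 feature being used, and they show up in commands as $_HOST...$ macros.

      ```
      define host{
              name                    snmp-sysedge-template   ; hypothetical name
              register                0                       ; template only
              hostgroups              sysedge-agent
              _SNMP_VERSION           1
              _SNMP_PORT              1691
              _SNMP_COMMUNITY         public                  ; placeholder value
      }

      define command{
              command_name    check_snmp_custom       ; hypothetical name
              command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -p $_HOSTSNMP_PORT$ -P $_HOSTSNMP_VERSION$ -C $_HOSTSNMP_COMMUNITY$ -o $ARG1$
      }
      ```

      A service would then call check_snmp_custom!<OID> and pick up the port, version and community from whichever host it runs against.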

    2. Re:Too many amateurs using Nagios by rossz · · Score: 1

      That's similar to what I've been doing. When I said ten lines to define a new host, I misspoke. It's actually one more line than you used because I always define an alias, too.

      One thing I haven't worked out a simple solution for: there is no simple way to override the notification group, so I have to define a whole new template for every single service check if I don't want to use the default group. Perhaps I'm overlooking the obvious?

      I still have all the configuration files from the very first nagios system I set up. It's to remind me of how badly things will turn out if you don't know what you are doing.

      --
      -- Will program for bandwidth
    3. Re:Too many amateurs using Nagios by Da+Web+Guru · · Score: 1

      The Nagios config files are atrocious. Trying to navigate through them is sometimes like an exercise in insanity.

      That being said, if you are in a large organization (e.g., a large web hosting company with multiple datacenters) that needs to monitor thousands of services on thousands of hosts, it can be done. However, you can't go mucking around in the config files all willy-nilly. You have to build a framework around them. At the hosting company I work for, we have deployed Nagios collector nodes in multiple facilities that all report back to a parent node that allows us to immediately see what is down, what platform it is a part of, and what city it is in at a glance. (It also has a link back into a database that provides more details about that particular server.) I don't work a whole lot with designing the actual configuration files (most of the configs are insane), however I do manage the web-based back-end database that we use to manage all of the services and hosts. (They tell me "the config file needs to look like this, make it happen.") You can't deploy Nagios on a large scale without an administrative back end orchestrating nearly *everything* (including services that are available for monitoring, which hosts to monitor, escalation paths and contacts, templating, etc.), especially if multiple people in multiple locations have to manage various parts of it. (The fact that Nagios does not have a built-in database-driven config system was an annoyance that we had to work around.) To do so would be an exercise in futility, especially with multiple offices with different services to monitor and different escalation requirements and different host and service templates and different levels of access to manage. Somehow we (a very small team with tons of other stuff to do as well) have managed to get it right and are currently deployed in 4 (soon to be 5) different facilities.

      --

      --guru

    4. Re:Too many amateurs using Nagios by perldork · · Score: 1

      Do you mean the contact group? Yes, I haven't found a terrific way around that either. I will define a base service template for each specific customer / group that inherits from my global template .. it has the contact group(s) that make sense for that customer in them and just inherit from that template for each service I define for the new customer or group, but that still involves a one to two line template per service for each service the customer wants.

      Hmm .. that is a really good point and breaks having services all mapped to host groups globally if different groups only want someone in their group to be notified for that set of services (instead of a global group), doesn't it?

      Have to think about how to resolve that one :).

    5. Re:Too many amateurs using Nagios by perldork · · Score: 1

      Ok, here is one way to handle this, not perfect, but better than having to redefine every service just to get a new contact group.

      Say we have a net-snmp hostgroup that acts as a container for all of our nifty Net-SNMP based services (service to hostgroup relationship).

      Now group west-coast wants to have all of those checks but they want to be notified via their contacts not the default service contacts.

      You can:
      * Turn off notification for all services that are associated with host groups in global mappings .. so notification is off by default.
      * Turn on regexp matching in the Nagios config
      * define a new serviceescalation object that catches all the hosts in your west coast group
      * Define contacts in the escalation object

      Example:
      * West coast hosts have the string 'west-' in the name (if they don't, just create a new host group that lists them all)
      * Contact group for service checks on the host is west-coast-contacts

      ; West coast hosts custom notification rules
      define serviceescalation{
              host_name .*-west-.*
              service_description West coast notification
              contact_groups west-coast-contacts
              first_notification 1 ; Notify immediately
              last_notification 0 ; Notify as long as this is an issue
              notification_interval 60 ; or whatever makes sense
              escalation_period 24x7
              escalation_options c ; Only notify on critical problems
      }

      Now you don't have to change your global service to host group mappings, you can just add group-specific hosts to your global groups and then control the service notifications through serviceescalation objects.

      Make sense?

    6. Re:Too many amateurs using Nagios by rossz · · Score: 1

      I've tried every variation of that without success. The only difference is I left notifications enabled since I still want all alerts to go to the default contact group. I'm using version 3.0.1. Perhaps a bug fixed in the latest?

      --
      -- Will program for bandwidth
    7. Re:Too many amateurs using Nagios by perldork · · Score: 1

      Did you enable regular expression matching in nagios.cfg? Have to set

      use_regexp_matching=1

      in nagios.cfg ...

      I will try this tonight myself to see what results I get.

    8. Re:Too many amateurs using Nagios by rossz · · Score: 1

      Yep, I turned on regexp. I've since turned it off as it didn't work and regexp can have some unexpected side-effects I wish to avoid.

      --
      -- Will program for bandwidth
    9. Re:Too many amateurs using Nagios by rossz · · Score: 1

      Turned out to be an error in my serviceescalation configuration. Stupid me. It works perfectly now.

      --
      -- Will program for bandwidth
  32. Ease of use? by VoidCrow · · Score: 1

    I always found it an absolute pig to configure.

  33. Ubuntu 8.04 works effortlessly with OpenNMS by Nick+Driver · · Score: 1

    I tried Nagios, and getting it installed, set up and running seemed to be an exercise in futility. An NMS is supposed to save you some work, not create more work to set it up and babysit it.

    I then tried OpenNMS and never looked back. Installation and configs were a piece of cake. I installed it to a fresh install of Ubuntu Hardy Heron desktop edition and was up and running in about two hours. Everything just simply worked right the first time.

    I've been using OpenNMS to monitor a 1000 user voip network with tons of cisco switches, routers and call manager servers scattered across 8 buildings for about four months now and it's working like a dream.

  34. NAGIOS = STEAMING PILE OF CRAP by Anonymous Coward · · Score: 0

    When I was evaluating Network Monitoring packages I installed most of the major packages in VMs to test drive all of their features. Out of all of the packages I installed, Nagios was the most unrealistic and unscalable of them all. Their approach of editing a single configuration file makes it impossible to use on a "real" network with many thousands of hosts. Once you get above 10 servers it gets really cumbersome to manage because of the syntax of their config file. If you have a few crappy Windows systems and Linux hosts to monitor, Nagios is tolerable. For a real enterprise monitoring tool it's a joke.

  35. Alert by Anonymous Coward · · Score: 0

    Get used to seeing the message

    ** PROBLEM alert - nagios is down **

  36. nagios is great NMS by kokoko1 · · Score: 0

    I have configured Nagios for a mid-size organization to monitor servers such as Linux and M$ boxes, plus over 1200 services, and also to monitor Cisco routers and switches. Nagios is not difficult to install/configure; however, setting up monitoring for 1000+ services can't be done in a single day. Most of the time one has to copy/paste the configuration for identical hosts. Once installed and configured properly, Nagios is a great NMS for sysadmins in terms of notification of server or service failures. The beauty of Nagios is that you can integrate it with a variety of services such as Jabber, email, SMS, and alarms. Nagios will notify of service/server failures by email; in my case I have also integrated it with our SMS gateway (running gnokii), so it not only shows alerts on the web interface but also sends emails and SMS to the people concerned. I find it a great asset for any IT dept. No organization these days can live without an NMS, and Nagios is a great NMS.

    --
    http://askaralikhan.blogspot.com/
  37. Green is Good - Red is Bad by Anonymous Coward · · Score: 0

    Hobbit is much easier to configure and use

  38. OpenNMS and Hyperic by mu51c10rd · · Score: 1

    I migrated from Nagios to OpenNMS/Hyperic and never looked back. Between the Hyperic agent and its granular statistics, and OpenNMS's alerting, mapping, and autodiscovery, I would never use Nagios in an enterprise environment again.

  39. Autodiscovery by mu51c10rd · · Score: 1

    I prefer autodiscovery. When a new device comes up, I don't have to do anything. It magically is part of the correct groups, is graphing, and alerts are done without me lifting a finger. Much better than needing to connect to the box, edit the config files, and reload the config.

    1. Re:Autodiscovery by perldork · · Score: 1

      And puppet makes a great companion to Nagios or any other NNM for this .. why might this be better than having the NNM scan everything on your networks looking for new devices?
      * Puppet knows your host, so it can deterministically place the host in the right groups using the rules you define for determining where the host belongs .. much less fuzzy than finding a host and then figuring out from the outside what kind of host the host is
      * Puppet can send data off to your CMDB at the same time
      * Puppet can check a template into SVN (or whatever SCM system is in use) and trigger a reload of Nagios or other NNMs without user intervention.

      Just using Puppet for this example because it, like Nagios, is so flexible and so easy to integrate with any other systems in use at an organization.
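
      To make the "deterministic placement" point concrete, here is a rough Python sketch of the idea (not Puppet code). Everything in it is hypothetical: the fact names, the group names, and the generic-host template are assumptions for illustration. The point is only that the host's own facts decide its groups, so nothing has to be guessed from the outside.

```python
def hostgroups_for(facts):
    """Pick host groups from facts the host reports about itself."""
    groups = ["all-servers"]
    if facts.get("region") == "west":
        groups.append("west-coast-hosts")
    if facts.get("role"):
        groups.append(facts["role"] + "-servers")
    return groups


def render_host(facts):
    """Render a Nagios host definition as text from a dict of facts."""
    lines = [
        "define host{",
        "        use generic-host",  # hypothetical template name
        f"        host_name {facts['hostname']}",
        f"        address {facts['address']}",
        f"        hostgroups {','.join(hostgroups_for(facts))}",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical facts for one host, for illustration only.
example_facts = {
    "hostname": "web-west-01",
    "address": "192.0.2.10",
    "region": "west",
    "role": "web",
}
print(render_host(example_facts))
```

      The rendered text could then be checked into your SCM and picked up on the next Nagios reload, along the lines of the last bullet above.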

  40. no "free e-book download" by ApproachingLinux · · Score: 1

    I noticed that the cover says "free e-book download" in the upper right. This is slightly off-topic (of Nagios), and somewhat on-topic (Syngress). If you buy these books with the understanding that you will get a free PDF copy to have around, this is no longer true since they were bought by Elsevier. The old web site where you could register the books and get free downloads is gone, and there is a new service on the Elsevier web site, but no way to register your recently purchased books. There is a FAQ that tells you how to "register" your book, but this requires a "redemption code" that is not in the book. I've written to them to get a redemption code and they haven't responded. Buyer beware.

    1. Re:no "free e-book download" by perldork · · Score: 1

      Things have been a mess since they reorganized. If you bought the book, please write either of these two people to get your situation straightened out (they are both on the management team for the book):

      Colantoni, Laura (ELS-BUR) - L.Colantoni@elsevier.com

      Cater, Matthew (ELS-BUR) - M.Cater@elsevier.com

      They are responsive and will do their best to work this out for you .. respond to this thread if you don't hear from them in the next few days.

      I am one of the authors of the book (Max Schubert), and I have been very disappointed myself with how poorly this transition has gone. It is frustrating because there is nothing the authors can do to help, and you are not the first customer to complain .. they have also misplaced the VMware image promised with the book and have yet to get that online. Grr.

      I hope they respond quickly to you.

    2. Re:no "free e-book download" by perldork · · Score: 1

      I am sorry you haven't heard back from them. You can write either of these two people for help:

      Cater, Matthew - M.Cater@elsevier.com

      Colantoni, Laura - L.Colantoni@elsevier.com

      Laura is a managing editor on the book and Matt worked extensively on it as well. They are both nice people and both have indicated that they wish to straighten out the problems we have had post publication, which includes your complaint and a number of others .. some of which they have resolved, and some which they still haven't 4 months after the book came out. Grrr.

  41. What a tool... by NateTech · · Score: 1

    "Due to its extensibility and ease of use, no device or situation has yet been found that cannot be monitored using Nagios and a pre-made or custom script, plug-in or enhancement."

    It's so true! We put an air-actuated sensor on his pants, and now we get e-mail and SMS whenever grandpa farts. Thank you Nagios!

    --
    +++OK ATH
  42. For anyone who wants the code from the book .... by perldork · · Score: 1

    Hi,

    Our publisher really dropped the ball on this and all the authors involved with the book (including me) are very sorry about the problems purchasers of the book have been having getting access to the code etc in the book.

    The author team created this site

    http://www.nagios3book.com/

    because we saw early on the problems Syngress was having getting its act together. On the site you will also find the names and email addresses of the book's management team; they have all promised to do their best to help anyone who has purchased the book and has complaints about the electronic content that was supposed to accompany it.