Slashdot Mirror


Monitoring the Health of Your Penguin?

codepunk asks: "I work for a large manufacturing firm in the Midwest, working on a migration from Windows to Linux in the data center. We just completed installation of two full Oracle RAC 9i clusters. We are also in the process of configuring two clusters for our manufacturing floor's Linux desktop rollout. The machines that make up our data center are all Compaq Proliant Series machines. To facilitate hardware maintenance, we badly need a monitoring solution. HP offers Insight Manager as well as the Compaq Health Agents. This would seem like a natural fit, but the drivers these solutions install are binary-only. We have never managed to get them to work correctly, and we are really concerned about the stability of our systems with these modules loaded. We are not opposed to buying hardware in the future from a vendor that provides a more open solution. We are also not opposed to buying an open third-party solution. Slashdot, what do you use for Linux system hardware monitoring?"

1 of 45 comments

  1. RMON, SNMP, perl, and an extensible system by snopes · Score: 3, Insightful

    Here's how I'd suggest approaching the problem. Look into the platform MIBs and find out which values you can query. You should at least be able to get binary status values such as "fan working" and "power supply working". Then get yourself an easily extensible monitoring system. Frankly, Big Brother is antiquated and a pain to manage. Other recommendations made here are reasonable, but I'd suggest mon. It's not a monitoring system per se; it is a scheduling framework with the concepts of monitor and alert built in. Many monitors and alerts are available, but best of all, it's really easy to write your own. For such things (for most things), I like Perl.
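To make the "write your own monitor" idea concrete, here is a minimal sketch in Python rather than the Perl the comment prefers. It assumes the usual mon convention as I understand it: the monitor is run with the hosts as command-line arguments, exits non-zero on failure, and prints a one-line summary of failed hosts followed by detail lines. The OIDs, the "ok" value, the community string, and all function names are placeholders for illustration, not real Compaq MIB entries; substitute whatever your platform MIB actually exposes. It shells out to net-snmp's `snmpget`, which must be installed for real use.

```python
#!/usr/bin/env python3
"""Sketch of a mon-style hardware monitor (all names here are hypothetical)."""
import subprocess
import sys

# Placeholder OIDs: replace with the real ones from your platform MIB.
CHECKS = {
    "fan": "1.3.6.1.4.1.99999.1.1.0",
    "power_supply": "1.3.6.1.4.1.99999.1.2.0",
}
OK_VALUE = "2"  # assumption: the MIB encodes "ok" as the integer 2


def snmp_get(host, oid, community="public"):
    """Fetch one value with net-snmp's snmpget; return None on any error."""
    try:
        result = subprocess.run(
            ["snmpget", "-v", "2c", "-c", community, "-Oqv", host, oid],
            capture_output=True, text=True, timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()


def failed_checks(host, query=snmp_get):
    """Return the names of checks on `host` whose value is not OK_VALUE."""
    return [name for name, oid in CHECKS.items() if query(host, oid) != OK_VALUE]


def run_monitor(hosts, query=snmp_get):
    """Return (exit_code, report_lines) for the given hosts."""
    failures = {}
    for host in hosts:
        bad = failed_checks(host, query)
        if bad:
            failures[host] = bad
    if not failures:
        return 0, []
    lines = [" ".join(sorted(failures))]  # summary line: the failed hosts
    lines += [f"{h}: {', '.join(failures[h])}" for h in sorted(failures)]
    return 1, lines


# mon passes the hosts as arguments; with none there is nothing to check.
if __name__ == "__main__" and len(sys.argv) > 1:
    code, report = run_monitor(sys.argv[1:])
    if report:
        print("\n".join(report))
    sys.exit(code)
```

The `query` parameter exists only so the decision logic can be exercised without a live SNMP agent. An unreachable host returns `None` for every check and is therefore reported as failed, which seems like the safer default for hardware monitoring.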