What Do You Use for SNMP Monitoring?
linuxi386 wonders: "My company is in the process of implementing a global frame relay system. The network will cover 20+ states, and several European and Asian countries and Australia. It will have a 5 point full mesh fail-over with each coast/country having about 20 ppp links about 30 servers mixed between linux and windows plus a 2003 domain controller at each site. I have been looking for a really decent cheap web based monitoring application to maintain the entire system. So far I have looked at Solarwind's Orion and Adventnet's Opmanager. I like the look of Orion, but while I prefer the feature base of Opmanager, I cannot stand its pricing model or the XP playskool style theme it uses. I am trying to avoid writing my own system to manage this if at all possible. What would you folks recommend and why?"
I'm not very familiar with the other 2, but I believe Netreo is in the same space... it's what UC Irvine uses, I think.
We have a medium sized setup and for us, Cacti works great. http://www.cacti.net/
We use level platforms.
Just google for "full mesh fail-over" "ppp links" and...no, wait, forget that....
I'm posting as an AC so I don't break any I.P. and/or NDA's.
At the companies I've worked at, we have typically started with the free monitoring software package Nagios and, after a short period of time, purchased the commercial product NetCool. NetCool is everything you could ever ask for... assuming you have a few months to tweak the rules to set the event levels correctly... But I guess all monitoring systems are like that.
Depending on the size of your NOC, your datacenter, and your client base, I would recommend starting with Nagios and, if it proves to be too small for your needs, moving to NetCool. (Just be prepared to pay serious $$$ for NetCool.)
HTH
A.Coward
If your company is willing to spend that much money on the network, a 'cheap' NMS tool is the wrong solution. Too often companies invest in technology only to skimp on the management of that technology. The end result is overall poor performance and dissatisfaction with the technology. I would suggest a real NMS tool such as OpenView.
It seems like if you are spending all the money on that equipment, you might not want to go with a "cheap" solution. There should always be a good budget for software in any project. You want it to be powerful enough. That said, you shouldn't discount the free/cheap solutions just because they are free/cheap/open source. That's my 2cents.
All sigs are created equal.
I've used the Solar Winds software suite, HP's OpenView, and CiscoWorks myself for managing infrastructure for about 3500 devices. CiscoWorks is slow, but has tons of features if you're working with a lot of Cisco devices. OpenView is good for generating logical maps and managing a heterogeneous network with a lot of different devices. Solar Winds and OpenView were about the same in functionality for me. Out of the 3 I've used, I have always thought that I could do better, but I just don't have the time. I know there is OpenNMS, but I've never tried it. Enterprise management suites are generally useful, but a lot are more of a hindrance. If you find a good one, let me know. Good luck on the search.
Nagios is a fairly easy-to-learn, extremely extensible (can you use a scripting language?) monitoring system. It scales reasonably well, distributed stat gathering, can respond to SNMP traps, etc. Not the easiest out of the box (you'll spend a day or two learning to use it and set it up), but there's very little you can't make it do.
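To make the "can you use a scripting language?" point concrete: a Nagios check is just a program that prints one line and returns an exit status (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Below is a minimal sketch in Python. The script name check_value.py, the argument order, and the idea of passing the measured value on the command line are my assumptions for illustration, not anything from a shipped plugin.

```python
# Nagios plugin convention: the exit status encodes the service state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_threshold(value, warn, crit):
    """Map a measured value to a (status, message) pair."""
    if value >= crit:
        return CRITICAL, f"CRITICAL - value is {value} (crit >= {crit})"
    if value >= warn:
        return WARNING, f"WARNING - value is {value} (warn >= {warn})"
    return OK, f"OK - value is {value}"

def main(argv):
    # Hypothetical usage: check_value.py <value> <warn> <crit>
    try:
        value, warn, crit = (float(a) for a in argv[1:4])
    except ValueError:
        # Covers both non-numeric arguments and too few arguments.
        print("UNKNOWN - usage: check_value.py <value> <warn> <crit>")
        return UNKNOWN
    status, message = check_threshold(value, warn, crit)
    print(message)  # Nagios displays the first line of plugin output
    return status
```

To run it as a real check you would add `import sys` and end the file with `sys.exit(main(sys.argv))`, then point a command definition at the script. In practice the value would come from an SNMP query or a local measurement rather than an argument.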
Help save the critically endangered Blue Iguana
Is a 'theme' really going to turn you off a piece of software? Ask the company if you can have it re-branded. Many companies will do this for free, especially web-based tools... and if they don't, well it's web based... there are stylesheets, graphics and html, it really shouldn't be that hard to make some radical visual changes without too much work.
So go with the tool that works best, looks are pretty easy to adjust, as long as usability is there to begin with... if it's clunky, confusing and you hate how it looks... well that would take a bigger commitment to fix than just looks but it's been done before. Example... I once completely redesigned the UI for Bugzilla, canned queries, new workflows, collapsing panes, calendar widgets, color coding and more... but it was worth it in the end and that company still uses it 90% the way I left it. Which means it wasn't wasted effort.
Well, think about it anyways.
A fool throws a stone into a well and a thousand sages can not remove it.
On a large cluster, we considered OpsManager, Cacti, and Ganglia, and have run all 3.
OpsManager has some real nice features which made it easy to display and group results, especially to non-engineering people (good graphing tools built in, etc.), but we found it didn't perform as well as the other 2. Additionally, you have to pay for it.
Cacti was nice because of the built-in hooks for Apache and MySQL, but it didn't have some features we wanted (auto host discovery, certain data summarization).
We use Ganglia now. It's open source, has a good track record on large clusters, and has proven speed and reliability. It will do auto discovery of hosts, but the downside was that there were no built-in hooks for MySQL and Apache, which we did want to monitor.
So consider the set of data that you're interested in monitoring, and how big you intend on scaling it.
If your network has a certain size and you do everything by SNMP, you need to be able to correlate the events to avoid alarm floods when one link goes down. We have used Openservice's Nervecenter with great success, coupled with NetCool from IBM. The pricing is steep, but the products are top-notch. In our configuration, we monitor about 8,000 network devices (Cisco, 3com, Bay, Nokia-IPSO, Consentry, etc.) using 2 Nervecenter instances running on 2 Sun 480 boxes.
(I'm not affiliated with these companies or products)
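For anyone wondering what "correlate the events" means in practice, here is a rough sketch of the simplest form: suppress alarms for devices that sit behind something that is already down. This is not how Nervecenter or NetCool do it (their rule engines are far richer); the function name and the parent map are made up for illustration, and it assumes you know each device's upstream neighbour.

```python
def root_cause_alarms(down, parent):
    """Return only the down devices that have no down ancestor.

    down   -- set of device names currently failing their poll
    parent -- dict mapping each device to its upstream device (None at the core)
    """
    roots = set()
    for device in down:
        upstream = parent.get(device)
        symptom = False
        seen = set()  # guards against loops in a badly built parent map
        # Walk towards the core; a down ancestor makes this alarm a symptom.
        while upstream is not None and upstream not in seen:
            if upstream in down:
                symptom = True
                break
            seen.add(upstream)
            upstream = parent.get(upstream)
        if not symptom:
            roots.add(device)
    return roots
```

With a WAN router down and everything behind it unreachable, this reports only the router, which is the one page you actually want at 3am.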
How is it that you're obviously spending a huge amount on the network infrastructure and want to cut costs so much on network monitoring? After going to all the effort of setting up you'll want a decent tool that tells you the instant something is wrong - and before the users tell you!
Something like HP OpenView does the job. Cisco have a sw tool but not as good, as do Sun and IBM. CA Unicenter is overkill and too expensive to my mind. For small jobs (less than 100 nodes) I've used Ipswitch Whatsup Professional. You want something that goes inside your switches and has agents for all your servers if you want to monitor properly.
In the dim past (10+yrs ago) I used Scotty (a Tcl/Tk freeware tool) and at other times wrote my own in Python/TK with Perl daemons/services.
net-snmp on sourceforge has tools you can use but to my mind these days, again I'd say - it's an expensive (and I presume important) network you've got there, so spend some money to monitor it properly. The expensive tools ($30k+) all have ready-made agents or know about a huge variety of hw so you don't have to customise MIBs and code (though Unicenter takes a lot of customisation to work well and they all need customisation of sorts). It might take you 3 months to do a half-decent job coding yourself what a commercial package could do with more features in a few weeks, and you've got support and someone to complain to if there are problems. How much money would be lost when the network goes down in those three months? Just one hour for a large corporation would cover the cost of the sw.
I do agree it's great fun rolling your own (I'm sure you're a great programmer) if you have the time and the corporate managers don't appreciate the need to monitor things properly and you can't convince them to spend the dollars - but when it goes down it'll be your arse and the managers'/company's money being lost while you sweat to fix things - they'll quickly tell you then (and rightly so) it would have been worth doing it right the first time (you didn't think they'd take the heat for this now did you?) no matter how good your code will look in just another months time.
At worst write some emails as evidence that you requested such and such a package with official quotes and have their replies on record they refused to spend the money on it. I know of one company that went to the wall when the network went down (chain of retail stores) and a series of seemingly small faults on critical days (like the last shopping days before christmas) meant the company went under and the IT consultants who designed the system took the blame in court in the end - cost them $30m (plus a few hundred ppl lost jobs).
Now if this is just some academic network or it's not your responsibility then fine (mind you many research places are even more fussy about their networks than corporate users).
Unfortunately there are times when jumping into coding, no matter how well intentioned, isn't the most pragmatic or best solution.
pithy comment
SNMP? That's complicated stuff to set up.
At work we rely on the much more robust and easy-to-use URMP ("User Resource Management Protocol") to monitor our systems. When the systems go down, the users let us know about it.
If historical data is a requirement for MRTG, for a small installation, you can easily script a daily or weekly archive of the MRTG HTML (and data) directories. Presto -- historical archive, with pictures and everything.
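A minimal sketch of that archive script, assuming the MRTG output lives in a single directory and you run this from cron daily or weekly. The function name, the archive naming scheme, and any paths you pass in are my own placeholders.

```python
import datetime
import tarfile
from pathlib import Path

def archive_mrtg(mrtg_dir, archive_dir, today=None):
    """Write a dated .tar.gz snapshot of mrtg_dir into archive_dir."""
    today = today or datetime.date.today()
    source = Path(mrtg_dir)
    target_dir = Path(archive_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"mrtg-{today.isoformat()}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the MRTG directory
        tar.add(source, arcname=source.name)
    return target
```

Each run leaves one dated tarball holding the HTML, graphs and logs as they looked that day. Running it twice on the same date overwrites that day's file.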
I use InterMapper from Dartware. It is simple, intuitive, and cheap.
Take a look at Intellipool Network Monitor, you can find it at http://www.intellipool.se./ The pricing is fair and their support is excellent. INM also supports distributed monitoring, i.e. if you have geographically diverse locations you can set up multiple monitors that are slaved to the main server. Support for adding custom SNMP MIBs exists.
--- Reality doesn't care about your opinions, it happens anyway and if you are in the way you'll get squished.
SNMP == "Security is Not My Problem" ;-)
Seriously though, what you use is entirely dependent on what exactly you want to monitor.
It is trivial to write up simple net-snmp based pollers to push into RRDTool for graphing (my preferred method for generating traffic stats, after polling for which interfaces are administratively and operationally up, saves on having to configure what interfaces to monitor as you do with MRTG). Same data can also be pushed into whatever you use for historical logs.
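A rough sketch of the two pieces of logic in such a poller, leaving out the actual SNMP fetch (you would get the values from net-snmp, e.g. by walking ifAdminStatus, ifOperStatus and ifInOctets). The function names and the dict-of-ifIndex input shape are assumptions of mine; the value 1 for "up" is from IF-MIB.

```python
UP = 1  # IF-MIB: ifAdminStatus / ifOperStatus value for "up"

def interfaces_to_graph(admin_status, oper_status):
    """Return sorted ifIndex values that are both admin up and oper up."""
    return sorted(
        idx for idx, admin in admin_status.items()
        if admin == UP and oper_status.get(idx) == UP
    )

def octet_rate(prev, curr, seconds, counter_bits=32):
    """Bytes per second between two counter samples, allowing one wrap."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = curr - prev
    if delta < 0:  # counter wrapped past its maximum since the last poll
        delta += 2 ** counter_bits
    return delta / seconds
```

If the data only goes into RRDTool, a COUNTER data source already deals with wraps, so octet_rate matters mainly for the copy you push into your own historical logs.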
If you don't want to roll your own, just use Nagios.
We use Solar Winds and SNMPC together, however, What's up Gold is useful. Both Solar Winds and SNMPC are very powerful tools that can monitor large, widespread networks.
Although a PITA to set up, Netdisco http://www.netdisco.org/ is a pretty awesome Open Source solution.
strike
"Someone needs to talk to the tree of liberty about its ghoulish drinking problem." by ohnocitizen
I've used NMIS to good effect, really love this tool. Also don't forget something like rancid either, it will save your life at some point.
I found a fast warez site: http://warez.it.kth.se
I've worked with HP OpenView for a few years and it's a rock solid tool, you've just got to know how to use it. The thing I liked the most is the integration with different hardware-specific solutions like Sun's, Fujitsu and Compaq/HP. It's the solution used in one of the major telecom companies in the world so it's gotta be good... or the HP suits are very entertaining ;-)
C 'ya
The Ganglia ebuild has just been broken on Gentoo by the gcc-4.1.1 upgrade.
In fact, scores of packages that were in stable now bomb out on compile. I guess the real problem is that they're still in Portage as "stable", despite no longer compiling. Pretty annoying.
It seems that gcc-4.1.1 is extraordinarily fascist about templates and lvalues now, rejecting code that's been stable for ages. Ganglia is just one example of many.
I just went through a similar issue where we wanted to monitor multiple client sites. This included all servers and workstations, along with SNMP devices. I tested a lot of products, most of the ones mentioned here. I ended up with Advanced Host Monitor. The feature set is very robust. It's for the most part agentless and has a distributed option for monitoring remote sites easily. Support for Linux, FreeBSD and more. The developers so far have been very responsive and the user forums are active. The best part is the price. I hate the per-node licensing model, which is why I really like this product. For a few hundred dollars you get an unlimited node version of the product (well, a 20,000,000 node limit). I would recommend that anyone considering a monitoring solution at least give it a look.
Frame Relay, like ATM, is quickly becoming yesterday's technology. Some major Tier-1 providers are already phasing out their native Frame and ATM networks in favor of MPLS and MPLS VPNs. You may find that your investment in Frame Relay switches (Nortel?) will seem like folly in only a year or two.
Check out this software called Ubersmith. It started out as a billing system, but then grew into a Billing, Support, Device Manager/Monitoring package all integrated into one package. My company uses the Pro version of their software which lets you do SNMP monitoring. The cool thing about this program is that it's all nicely integrated. I have a client, I can click on their profile, view their services, and check out any devices associated with them. It will automatically notify admins or users if a device is down, and even supports APC's for built in remote reboot. Their new software that's coming out called Datacenter Edition looks to be incredibly cool: http://www.ubersmith.com/de_preview/
So, let me get this straight, you're building a GLOBAL frame relay system, with nodes in 20+ states, with massive redundancy, and you're looking for a CHEAP system?
Get yourself together and look for a GOOD system. If you're already spending TONS of money, you might as well spend some more to get exactly what you want, instead of settling for something. It might turn out that a free system is the best system for you, but please, good HAS GOT TO come before cheap!
Move sig!
I've been looking for a replacement for a homegrown system management infrastructure for some time, anyone check out Hyperic yet? It seems to have a good list of supported applications, and layers, plus some smart modern approaches.
http://www.zabbix.org/
http://www.sciencelogic.com/
We have an EM7 appliance. We're currently monitoring approx. 100 devices. 80 are servers, the rest network devices.
I'm a fan of InterMapper, powerful but not overly complicated, and easily extensible. It also runs on MacOSX, Windows, Linux, Solaris, and FreeBSD. It was originally developed at Dartmouth College to support their network, and has been marketed commercially since 1996.
There was an article on /. last week http://linux.slashdot.org/article.pl?sid=06/08/28/1839201 that mentioned Zenoss. I downloaded the VM appliance and have been using it since. It can use Nagios plug-ins or you can write your own. It's OSS and they have paid support options if you want. There's even an option to monitor WMI events. Nice graphs and alerting options. www.zenoss.com
I also use Nagios and mrtg.
You could take a look at: http://www.sysorb.com/
:)
:)
> The network will cover 20+ states, and several European and Asian countries and Australia.
* Our system allows for "satellites" which are remote monitoring stations allowing you to perform checks against a given node from several remote locations.
* Our system works well even in NAT'ed setups where several remote private-network sites report in status info to a central monitoring server
* You can even delegate administrative tasks, so that the asian administrators can view only their own systems but may have administrative privileges there, for example
> about 30 servers mixed between linux and windows
* In addition to SNMP and various network checks (HTTP/...) we provide an agent which provides detailed system monitoring data for your servers - this agent runs on both Windows and the most common Linux distributions (in addition to NetWare/HP-UX/Solaris/...)
> really decent cheap web based monitoring application to maintain the entire system.
* You can get on-line quotes on our site, or send a mail to sales@... they will answer
* Both configuration and day-to-day monitoring operations are completely web based
* We have a clever web-based configuration system to make it easy to set up detailed monitoring on large networks
All in all, there are pros and cons to all the systems you mention and (no surprise) to ours as well. No two are really alike. My best advice to you is to take a look around, get some free trials going and see what you like. And talk to the vendors. If they won't help you during the trials, they probably won't either when you face real problems later on.
In spite of this being a blatant plug for the company I work for, I hope the moderators will go lightly on this post since it is completely on topic and specifically answers the question asked. Thank you very much
MRTG
Oldie but a goodie.
-d
"Here Lies Philip J. Fry, named for his uncle, to carry on his spirit"
I like Groundwork Monitor which is pretty much a front end for configuring Nagios.
http://www.groundworkopensource.com/downloads/full_download.html
I work on NMS almost full time for a medium scale telco/ISP. IMHO Cacti, Nagios, JFFNMS, and other 'free' network management solutions don't do the job. They are too flaky, unstable, and unsupported for large scale industry usage. Your best bet would be something along the lines of Whats Up Professional, or something similar. The reason being that you get support, and you have someone to call when you have issues. If you'd like to sort through the open source BS of most of the free NMSs then go for it, but don't expect it to work in a pinch. For example, one of the issues with Nagios is that it uses an injection queue, and once you get 1,000s of devices in there you can potentially have hours of delay for a notice just because it takes so long to process the queue. Furthermore you have to mess around with text files and the like, and one wrong fuck up and you're hosed: your NMS is down till you can find the single typo in a 4,000 line file. There are other supported commercial products (yeah, that's a scary Slashdot word), but still. Going retail is your best bet. Just be prepared to buy an uber beefy box, spend quite a bit of money, and have a steep learning curve. Or just hire an intern.
http://www.kernel.org/software/mon/ I was one of the implementation crew for a small NOC (about 7 people incl. managers) and approx. 150 machines in various locations. I reviewed quite a lot of free software and while most of them were looking quite nice (nagios/bigbrother/etc.), almost all of them were filled with features that were really not essential just to "monitor the health of the system", so I ended up with mon. Mon, for me, was really the "unix way" of creating stuff: make things easy/simple and extend them with other tools. The generic layout we used was net-snmp on client hosts, either being polled at intervals or sending traps to the main machines.
yush
For a network like yours, you do not want to "do it yourself" with Nagios. Nagios is the best network monitoring package available, but unless you have a full-time system admin dedicated to it, you will be in a world of pain. A better plan would be to look at Groundwork Monitor Professional (www.groundworkopensource.com). The core of GMP is Nagios, but Groundwork have added plenty of integration goodness (profiles of service checks for particular servers: got an Exchange box but don't know which services to monitor; no problem, just use the Exchange profile containing all of the important service checks for Exchange). Full GUI configuration, SNMP traps, graphing, the whole shebang. US$16,000 a year for unlimited devices plus support. Get Sheila at Groundwork to walk you through a Webex presentation and download Rich Trezza's VMware appliance from http://richard.trezza.us/vmach/index.html
The VM only contains the basic open source functionality, but it still kicks any available Nagios configuration package.
Is there any package out there that allows you to interact with the SNMP devices, and not just watch them?
You never did serious worldwide telco stuff if you recommend MRTG the way you do...
It's a nice Quick tool, and there it ends...
In other words, why should anyone take any of your comments seriously???
Aruba by Valencia Systems http://www.valenciasystems.com/
I'd question the idea of frame relay in this day and age. Local connections with MPLS tagging should be cheaper and easier to manage. A friend of mine rolled out 140-odd sites globally with MPLS. Cut the price in half, if I recall correctly.
kashani
- Why is the ninja... so deadly?
I used to work for a regional ISP and we used SiteScope with pretty good luck. They went a little crazy with their pricing several years back and it's more in line with some of the "Big Tools" pricing wise, but still easier to use. It has all sorts of built-in server monitors which will probably be of no use to you, but the SNMP section is quite extensible and the alerts are extremely flexible. It's at:
http://www.mercury.com/us/products/business-availability-center/sitescope/
I am also now using a less expensive tool called IPCheck. The company also makes a not-free MRTG ripoff called PRTG that adds some features and functionality that a plain-jane MRTG installation doesn't have. Their URL is:
http://www.paessler.com/
I use WhatsUp and a suite of products called WebNM from Somix.
I can monitor and alert on ANYTHING I want with this package. If I have questions, the support is AWESOME.
I used to use a lot of free tools, but the administration was time consuming (see MRTG).
When people discuss this issue they usually forget to make a distinction between fault monitoring and data gathering for historical trending (like, what has my traffic looked like this past year). Most tools are only very good at one of these tasks, while the other is a so-so add-on.
For data collection and graphing, I've found cricket (link) to be very good. Once you've learned it, you can easily add new snmp OIDs into monitoring. In my experience that's been important because there are often new, sometimes proprietary, OIDs that need to be polled. I think it beats cacti for ease of use and clarity. It uses rrdtool for storage, so you can easily keep / roll-up data for a very long period of time without running out of disk space. Its "config tree" concept is great. It is the MRTG replacement, par excellence.
It has some trapping functionality, but it doesn't really seem to be equal to that of other tools that focus on fault monitoring. Its front-end/display is somewhat limited (but not hard to modify). But I use it just for gathering the data, and my colleagues have written a totally different front-end/display for it.
We use Tivoli Netview coupled with Magnum Coordinator. It presents a logical picture of our network and also allows us to choose what events/alarms we are notified of. Our NOC relies heavily on it.
We also use CiscoWorks for large scale configuration changes.
Admittedly I've only used two or three products, but for what you suggest you've got, I'd definitely recommend ipClarity ( www.ipClarity.com ). It's the best for ease of setup and immediate results. When I subscribed to their service, I was getting web reports pretty much straight away, and it only took about 2 mins to set up. I'm pretty sure they have a US office now as well. Cheers
Sorry, but it's 2006 and Frame-Delay is well beyond its useful lifespan (apart from using it as an access protocol on MPLS to deliver multiple services to a CPE and thus to the client site). Expect costs from Frame to increase as access to experts and equipment gets increasingly difficult. Expect your service levels to decrease as nobody invests in this technology anymore. Have fun trying to get differentiated services to work on Frame by using multiple DLCIs with queueing/shaping on the CPE.
Also, WAN monitoring and management activities are outsourced to the provider. Client companies manage service levels and the outsourcer but not the technology. Only the most stubborn or inexperienced IT staff still believes that they add value in the fault/change/performance management process - unless your provider is really that crappy as your evaluation of the provider was shite.
Tell your boss to hire a grown-up IT architect and to let you move on. If he still believes that he should let you play, get MRTG (RRD) to track FECN/BECN (packets exceeding the queue threshold) on the Frame PVCs, track packets with the DE bit set (packets exceeding your CIR) and correlate this data to gain insight into the crappy service you will be getting (in terms of latency/throughput). If you use Frame, you'll need this information to beat your provider up on an ongoing basis. And use Nagios or similar to track the site downtime.
Lastly, any good provider will give you access to this information on a web portal that shows key performance indicators and service level compliance for your services. See note regarding evaluation of service providers above.
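A rough sketch of the "correlate this data" step for the FECN/BECN and DE tracking mentioned above, assuming you have already turned the per-PVC counters into per-interval deltas. The function name, the dict keys, and both default limits are invented for illustration; pick thresholds that match your own CIR and SLA.

```python
def congestion_report(samples, de_limit=0.05, ecn_limit=0.01):
    """Flag polling intervals where DE or FECN/BECN ratios exceed the limits.

    samples -- list of dicts of per-interval deltas (not raw counters)
               with keys 'frames', 'de', 'fecn', 'becn'
    Returns a list of (interval index, DE ratio, FECN+BECN ratio).
    """
    flagged = []
    for i, s in enumerate(samples):
        if s["frames"] == 0:
            continue  # idle PVC, nothing to compare against
        de_ratio = s["de"] / s["frames"]
        ecn_ratio = (s["fecn"] + s["becn"]) / s["frames"]
        if de_ratio > de_limit or ecn_ratio > ecn_limit:
            flagged.append((i, round(de_ratio, 3), round(ecn_ratio, 3)))
    return flagged
```

The flagged intervals are the ones to line up against your latency graphs before calling the provider.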
We have used HP Openview Network Node Manager, Openview Performance Insight, What's Up, and the older Sun Net Manager. So we are far from cheap.
The best solution we have found so far? Product called SysOrb. http://www.evalesco.com/ . The price is unbeatable for the feature set.
And it blows What's Up (Crap) Professional out of the water.
You can use SNMP queries on devices, install Agents on servers to be monitored, and even run simple Netchecks like seeing if there is an httpd responding on port 80, or even monitor for text in html.. on and on.
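For what it's worth, the simple Netchecks are easy to picture. This is only a sketch of the idea in Python, not how SysOrb implements them; the function names are made up, and fetching the page body for the text check is left to whatever HTTP client you prefer.

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, timed out or unresolvable all count as down.
        return False

def page_contains(html, needle):
    """Content check: True if the expected text appears in a fetched page body."""
    return needle in html
```

Something like port_open("www.example.com", 80) answers "is an httpd listening", while the text check catches the case where the server is up but serving an error page.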
If Microsoft was never created, who would we have to hate?
I like it, and it is way too easy to setup.
Hyperic has an enterprise and a community edition, so you can try it out and decide if you need enterprise support and features.
http://www.hyperic.com/
Alvaro
Try argus. http://argus.tcp4me.com./ We've used it for a few years and have had great luck with it. It's simple to set up, and simple to extend.
-- "Big Brother is Watching..."
Well, I suppose that it depends on your budget.
IBM's Tivoli is something to look at
BMC's Patrol
HP's Openview
NetIQ (which I hated)
Nortel's Optivity
Sun's Solstice
CA's Unicenter
Any one of those ought to be able to do anything and everything you're asking for. Out of that list, I personally prefer BMC, but that's me.
2 cents,
QueenB
HDGary secures my bank
At my company, we use AdventNet's OpManager and like it a lot. The price was right for us, and for us their pricing model works well. Pricing based on the number of technicians instead of number of monitored nodes works perfectly for a company like mine with few admins/techs but many devices. Paying per node just gets unwieldy, especially if your environment is changing or growing rapidly.
OpManager is fully capable of what you're looking for, I think you should give it another look despite your feelings about the pricing.
And who cares about the theme of a monitoring system? Does it work? If so, then is a shiny WinXP-like theme really a deal breaker? Come on...
I continue to be utterly amazed that people still take network monitoring for granted when building out highly complex networks. (For the sake of argument, we'll ignore that cheap monitoring is probably appropriate for legacy technology like frame relay...)
Since you're installing frame relay, I assume that you're using hardware from the Evil Empire (Cisco), so CiscoWorks is a perfectly adequate SNMP element manager for the Cisco hardware. However, if you're interested in more than just monitoring your routers, which would include servers, UPS, even facilities hardware like generators and air conditioners, you want a true enterprise NMS. And no, the open source freeware packages DO NOT qualify as enterprise- sorry to all of you advocates of What's Up and Big Brother and Nagios and all those other poser programs.
Unfortunately, you're going to have to lay out some money to get the kind of monitoring that you want. I can't speak for Unicenter from CA, but the full Unicenter package might be more than is required- a good start from CA would be Spectrum (acquired in the Concord/Aprisma purchase last year) for outstanding SNMP monitoring, including services and process management- they even have a Frame Relay focused module that would work well for you. Of course, there's always Tivoli- if you have more money than sense- or HP OpenView or Smarts from EMC.
Please, by all that is holy, don't go for the crappy cheap solution- you've already spent more than you need to if you've bought Cisco gear (remember, you can buy better than Cisco, but you can't pay more), so it shows that you're willing to spend money. Get a management solution that's worthy of your investment.
Stay hard.
I would recommend Big Brother (http://bb4.com) because:
- The free version works great
- It can be Linux/Unix based (would recommend) or run from Windows
- It gives a simple view of all network connected devices either on 1 or several pages, depending how you configure it
- Can utilise paging / alert acknowledging etc.
- There are many external scripts available at http://www.deadcat.net/ for specialised checking
- It is easy enough to write your own external scripts if you know the basics of shell scripting
- It integrates with LARRD - an RRD based graphical tool that gives you good looking graphs
Hello Cliff,
The following is some information on CITTIO's WatchTower product. Based in San Francisco, CITTIO is a leading provider of system monitoring software. WatchTower, CITTIO's flagship product, is an enterprise system monitoring and management software application that runs within a Web-based portal environment. CITTIO has a number of success stories enabling disparate, multi-location networks, including Gymboree, Mervyn's, Pacific Sunwear, and Pizza Hut. Some key benefits include:
- Increased IT productivity via proactive systems and network monitoring
- Increased visibility into the entire network, including disparate systems and locations
- Reduced downtime to ensure maximum business value is achieved from your IT and Ecommerce Systems
- Easy to install, administer, and use at a cost effective price
Please respond with any additional questions!
Cliff,
www.snmp.com for product information and technical discussion of your requirements.