Ask Slashdot: Performance Monitoring for Linux
muadib
wants to know about the following:
"Given the current discussions on tuning, I am trying to
find out if there are any performance monitoring
applications for Linux. I don't mean things like xload,
xosview, etc., which provide only a small amount of data.
For anyone who's done benchmarking under NT, I mean
something like their built-in perfmon utility that lets
you view and capture just about any statistic on your
system or on a remote system. Capturing is the specific
functionality I'm looking for b/c I'm working on a Linux
device driver, and it would be nice to have historical data
of CPU utilization, interrupts/s, etc. so that I can
compare complete system performance between code revisions."
Moodss is a modular application. It displays data described and updated in one or more independent modules loaded when the application is started. Data is originally displayed in tables. Graphical views (graph, bar, 3D pie charts, ...), summary tables (with current, average, minimum and maximum values) and free text viewers can be created from any number of table cells, originating from any of the displayed viewers.
A thorough and intuitive drag'n'drop scheme is used for most viewer editing tasks: creation, modification, type mutation, destruction, ... Table rows can be sorted in increasing or decreasing order by clicking on column titles. The current configuration (modules, tables and viewers geometry, ...) can be saved in a file at any time, and later reused through a command line switch, thus achieving a dashboard functionality.
The module code is the link between the moodss core and the data to be displayed. All the specific code is kept in the module package. Since module data access is entirely customizable (through C code, Tcl, HTTP, ...) and since several modules can be loaded at once, applications for moodss become limitless. For example, comparing a remote database server CPU load and a network load from a probe on the same graph becomes possible.
Apart from a sample module with random data, ps, cpustats, memstats, diskstats, mounts, route, arp modules for Linux, apache and apachex modules are included (running "wish moodss ps cpustats memstats" mimics the "top" application with a graphic edge).
All the above in rpm, tgz, ... see screenshots, html documentation, ... on my homepage at http://www.multimania.com/jfontain/
Regards.
Jean-Luc Fontaine
I'm currently working on a tool that resembles what you're looking for. It consists of 3 parts and a kernel patch. The patch adds a feature to the kernel that enables a "trace" driver to register and enables key kernel parts to call upon the driver to note that a given event has occurred. In turn, the trace module takes care of the events and puts them in a buffer. When a certain quantity of information is in the driver's buffer, it sends a signal to a trace daemon. The daemon then reads from the driver and appends the trace information to a trace file. The last part is the trace data decoder. This decoder takes the binary data and transforms it into a human-readable format. Because decoding happens after the fact, impact on the traced system is minimized. As of this time, all the above mentioned parts are complete. The only thing that remains to be done is to build a GUI for the decoder (right now it works perfectly on the command line). This is what I'm working on right now.
This system enables the observer to know exactly what happens at every moment in the system. As for remotely observing a host, this is not a problem; it is actually planned. This will consist of the trace daemon offering its services on an IP port which can be contacted by other hosts. If you're interested ... send me an e-mail. I'll have a web page for it as soon as the GUI is complete.
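To make the daemon/decoder split above concrete, here is a minimal sketch in Python. The record layout (timestamp, event id, payload), the event ids and the function names are all assumptions made up for illustration; the post does not describe the tool's actual binary format.

```python
import io
import struct

# Hypothetical record layout (an assumption, not the tool's real format):
# little-endian, 8-byte timestamp in microseconds, 2-byte event id,
# 4-byte payload. With "<" there is no padding, so each record is 14 bytes.
RECORD = struct.Struct("<QHI")

# Hypothetical event ids, for illustration only.
EVENT_NAMES = {1: "syscall_entry", 2: "syscall_exit", 3: "irq_entry"}


def append_events(trace_file, events):
    """Daemon side: append (timestamp_us, event_id, payload) tuples as binary."""
    for timestamp_us, event_id, payload in events:
        trace_file.write(RECORD.pack(timestamp_us, event_id, payload))


def decode(trace_file):
    """Decoder side: yield one human-readable line per fixed-size record."""
    while True:
        chunk = trace_file.read(RECORD.size)
        if len(chunk) < RECORD.size:
            break  # end of file, or a truncated trailing record
        timestamp_us, event_id, payload = RECORD.unpack(chunk)
        name = EVENT_NAMES.get(event_id, f"unknown_{event_id}")
        yield f"{timestamp_us / 1e6:.6f} {name} {payload}"


if __name__ == "__main__":
    buf = io.BytesIO()  # stands in for the trace file
    append_events(buf, [(1_000_000, 3, 14), (1_000_250, 1, 4)])
    buf.seek(0)
    for line in decode(buf):
        print(line)
```

Writing fixed-size binary records keeps the logging side cheap; all the formatting cost is paid later, when the decoder runs.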
Forgive me if any of these are obvious -- I'm
not trying to be sarcastic, it's just that some
sys admins don't know this yet:
"top" shows you CPU load, memory usage, and
usage per process -- there are many options in
"top"; check out the man page for it.
"pstree" shows a tree graph of all processes and
who spawned what.
"ps auwx" (or "ps -ef" in Solaris) shows all
current processes.
"netstat" and "ifconfig -a" show network info
such as errors, dropped packets, etc.
Big Brother is a decent package for monitoring
several servers at once. It generates a web page
of colored lights (GIFs) indicating system load, web
daemon status, email daemon status, ftp daemon
status, etc.
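None of the tools above keeps a history, which is what the original question asks for. As a rough sketch, assuming a Linux /proc/stat with an aggregate "cpu" line (user, nice, system, idle, possibly followed by more fields on newer kernels) and an "intr" line whose first number is the total interrupt count, a few lines of Python can log CPU utilization and interrupts/s as CSV. The function names and the CSV columns are made up for illustration.

```python
import time


def parse_proc_stat(text):
    """Return (busy_jiffies, total_jiffies, total_interrupts) from /proc/stat text."""
    busy = total = interrupts = 0
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cpu":  # aggregate line only, not cpu0, cpu1, ...
            # Use at most the first eight counters; the fourth is idle time.
            # Everything that is not idle is counted as busy (a simplification).
            values = [int(v) for v in fields[1:9]]
            total = sum(values)
            busy = total - values[3]
        elif fields[0] == "intr":
            interrupts = int(fields[1])  # first number is the running total
    return busy, total, interrupts


def rates(prev, curr, seconds):
    """CPU utilization in percent and interrupts/s between two samples."""
    d_busy = curr[0] - prev[0]
    d_total = curr[1] - prev[1]
    d_intr = curr[2] - prev[2]
    cpu_pct = 100.0 * d_busy / d_total if d_total else 0.0
    return cpu_pct, d_intr / seconds


def read_sample(path="/proc/stat"):
    with open(path) as f:
        return parse_proc_stat(f.read())


if __name__ == "__main__":
    interval = 1.0
    prev = read_sample()
    print("time,cpu_pct,intr_per_s")
    for _ in range(10):
        time.sleep(interval)
        curr = read_sample()
        cpu_pct, intr_rate = rates(prev, curr, interval)
        print(f"{time.time():.0f},{cpu_pct:.1f},{intr_rate:.0f}")
        prev = curr
```

Redirecting the output to one file per driver revision gives simple historical data to compare between builds.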
To join this list, send a message consisting of the single word "subscribe" (in the message body, not the subject) to:
The first objective of this list is to gather enough information to build a performance and reliability HOWTO. Many of the attendees of the BOF are on this list. This list is still in its infancy, but I'm sure that the Slashdot effect will change that!
There already exists a Linux performance mailing list.
To subscribe to this list, send an email to the following address
with "subscribe" in the body of the message:
linux-perf-request@www-klinik.uni-mainz.de
My colleague, Rich Pettit, has done a lot of work to add perf stats to the
/proc filesystem, so let's not reinvent all this stuff.
Seeing as someone already mentioned Datametrics,
I might as well put in a (shameless) plug for our product, RAPS. Check out http://www.foglight.com for more details. Again, it's aimed at the enterprise level, so it's not cheap, but it has OS-level monitoring as well as Sybase and Oracle agents and Netscape and Apache WWW server monitoring.
Why don't you wait for the SAR sources to be
released and adopt it? SAR supposedly gives
a lot of low-level statistics. If I remember
correctly, some time back [ok, a long time back!]
when I was writing an SNMP agent for Acer
Server Manager, we used to make use of SAR's
libraries on UnixWare to get some specific
statistics for instrumentation.
-Sas
It makes me laugh to see this article appearing on the same page as this Ask /.
Although, I presume no-one's ported it to Linux yet. :)
FreeBSD has "perfmon", the CPU performance
monitoring interface. A majority of the Linux
systems out there run on Intel CPUs. It is
not hard to implement an interface for
programming the CPU counters to measure
a variety of events. However, work needs
to be done to use the data so obtained
in a meaningful manner. The CPU counters can
be coupled with various system tools that give
us system statistics. The bottom line:
1. There are a few drivers and libraries
for Linux that allow you to make use
of CPU counters (at least for the x86).
These don't seem to be much used.
2. Like Intel's VTune tool (which pertains
mainly to code optimization), a tool
could be written for Linux that gives
extensive performance statistics, and
helps in optimizing code. The
infrastructure for such a tool is already
there, IMHO.
3. For interested people, I could find
(from some dust-laden hard disks)
pointers to related code and documents.
I worked on this subject long back.
I'm the one who posted this question [about two months ago! :O ], and I have since done some more research and asked around at the Performance BOF @ Linux Expo, and it didn't seem like there was anything that provided me with everything I'm looking for. So... in the spirit of open source I've decided to write my own. The basic idea is to have an agent running on each machine you want to monitor, and either a gtk or newt based UI on
the machine you're sitting at. Email me if you're interested in more info or helping.
- Deepak
Deepak Saxena
Project Director, Linux Demo Day '99
"Computers are useless, they can only give you answers" - Picasso
Maybe I'm just too tired right now, but I did not detect any Heisenberg refs in the threads I've read here on this topic. Trying to instrument a thread of execution is usually death for that thread. I despise the notion of "performance monitors". If you build it right, it will work, and any anomalies usually succumb to thoughtful analysis. Instrumenting a network at the level you guys are talking about can kill it, severely. Get ye a good sniffer, and learn to read its utterances.
Intuitively it would seem that measuring threads in great detail would distort the measurement. However, performance registers on board CPUs (I'm thinking Intel at the moment) allow one to monitor certain aspects of threads with almost no overhead whatsoever. It's perfectly feasible and desirable to monitor code at a fine-grained level.
There are many performance anomalies that can't necessarily be identified at the design level (for example pipeline flushing and cache problems). These detailed measurements can tell you much more about your program's behavior than the standard profiler can.
MRTG
UCD SNMP for Linux
MRTG is kinda a bear to work with for monitoring stuff other than a router, but it can be done. For an example you can check out my suso.org stats page. Look on the left side.