Nagios is a watchdog daemon that can be configured to check various properties of a computer system and issue warnings or alarms when one of them goes out of spec. The things it can check include free disk space, CPU load, and the ability to ping other hosts.
Nagios is installed in
. The main configuration files are located below that directory in
The server-1 nagios daemon can also run checks on other hosts via its
service. The SWC firewall allows the "nrpe" service through. Some fairly simple checks are run on each host. Each SWC runs a
service which executes requests from the nagios daemon on the server.
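As an illustration of what one of these simple per-host checks might look like, here is a minimal sketch of a free-disk-space check written in Python. It follows the standard Nagios plugin convention of one status line on stdout plus an exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names and the 20%/10% thresholds are assumptions made for this example; they are not taken from the checks actually deployed here.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_free_space(free_pct, warn_pct=20.0, crit_pct=10.0):
    """Map a free-space percentage to (exit_code, label).

    The default thresholds are illustrative assumptions.
    """
    if free_pct < crit_pct:
        return CRITICAL, "CRITICAL"
    if free_pct < warn_pct:
        return WARNING, "WARNING"
    return OK, "OK"


def check_disk(path="/", warn_pct=20.0, crit_pct=10.0):
    """Return (exit_code, status_line) for the filesystem holding path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot stat {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero size"
    free_pct = 100.0 * usage.free / usage.total
    code, label = classify_free_space(free_pct, warn_pct, crit_pct)
    return code, f"DISK {label} - {free_pct:.1f}% free on {path}"


def main(argv):
    """Print the status line and return the plugin exit code."""
    path = argv[1] if len(argv) > 1 else "/"
    code, message = check_disk(path)
    print(message)
    return code
```

To use this as a real plugin, the script would end by calling `sys.exit(main(sys.argv))` so that the daemon sees the exit code.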
Most nagios checks will reissue their alert periodically until the underlying condition is fixed. For a problem that cannot be fixed immediately, this can generate a lot of annoying spam. To avoid this, it is possible to acknowledge the problem, which causes nagios to stop sending email about it until the problem is fixed.
A set of scripts to assist in acknowledging problems can be found in
and have names that start with "nagios". The alert email will provide the name of the host and the name of the service. It is possible to acknowledge all alerts for a given host (
) or for a particular service on a host (
). Acknowledging a problem usually generates one final "alert" email stating that the problem has been acknowledged.
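For reference, the standard way to acknowledge a problem without the web interface is to append an ACKNOWLEDGE_HOST_PROBLEM or ACKNOWLEDGE_SVC_PROBLEM line to the Nagios external command file (the file named by the `command_file` directive in the main configuration). The sketch below only builds that line. Whether the local scripts work this way is an assumption, and the function name, the host "web1", and the author "alice" are hypothetical.

```python
import time


def ack_command(host, author, comment, service=None,
                sticky=True, notify=True, persistent=True, now=None):
    """Build one Nagios external command line acknowledging a problem.

    With service=None the host problem itself is acknowledged
    (ACKNOWLEDGE_HOST_PROBLEM); otherwise only the named service
    (ACKNOWLEDGE_SVC_PROBLEM).
    """
    # Fields are separated by ';', so reject values that would corrupt
    # the line. This is a conservative choice made for this sketch.
    for field in (host, service or "", author, comment):
        if ";" in field or "\n" in field:
            raise ValueError(f"illegal character in field: {field!r}")

    timestamp = int(time.time()) if now is None else int(now)
    sticky_flag = 2 if sticky else 1   # 2 = keep until the problem recovers
    notify_flag = 1 if notify else 0   # 1 = send the "acknowledged" notice
    persist_flag = 1 if persistent else 0

    if service is None:
        body = (f"ACKNOWLEDGE_HOST_PROBLEM;{host};{sticky_flag};"
                f"{notify_flag};{persist_flag};{author};{comment}")
    else:
        body = (f"ACKNOWLEDGE_SVC_PROBLEM;{host};{service};{sticky_flag};"
                f"{notify_flag};{persist_flag};{author};{comment}")
    return f"[{timestamp}] {body}\n"
```

For example, `ack_command("web1", "alice", "disk on order", service="Disk Space")` yields a line that acknowledges only that service on that host. With `notify=True`, nagios sends the final "acknowledged" email described above.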