Most changes to our Nagios server configuration are confidential, so they are not described in this public area.
Here you will find only a few examples; some values are anonymized.
Check HTTP: http://linux.101hacks.com/unix/check-http/
Our changes were:
- Include files.
- We use the directory search capability of the Nagios configuration (cfg_dir), which includes all .cfg files in the specified directory and its subdirectories.
- We commented out the reference to localhost.cfg.
- Admin mail address (admin_email and admin_pager, though they are not used).
- The global date format (date_format).
- refresh_rate=60 (set in cgi.cfg; the default is 90 seconds). Determines how often the web pages are refreshed.
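As a rough sketch, the changes listed above could look like the excerpt below. All values are placeholders. It is an assumption that cfg_dir points at the hosts directory, and the stock file name localhost.cfg is assumed for the commented-out reference.

```cfg
# nagios.cfg (excerpt, placeholder values)

# Read every *.cfg file below this directory
cfg_dir=/usr/local/nagios/etc/objects/hosts

# Stock localhost definition, commented out
#cfg_file=/usr/local/nagios/etc/objects/localhost.cfg

# Available to notification commands as $ADMINEMAIL$ and $ADMINPAGER$
admin_email=nagios-admin@example.com
admin_pager=nagios-pager@example.com

# One of: us, euro, iso8601, strict-iso8601
date_format=iso8601

# cgi.cfg (excerpt): refresh interval of the web pages in seconds
refresh_rate=60
```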
The first block of changes relates to sending SMS with sendmessage.pl:
To monitor a remote node with NRPE, another command has to be defined:
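A minimal sketch of such a command definition, in the form commonly used with the NRPE plugin. It assumes check_nrpe is installed in the plugin directory that $USER1$ points to.

```cfg
# $ARG1$ is the name of a command configured in nrpe.cfg on the remote node
define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```

A service then refers to it as, for example, check_nrpe!check_load.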
Give the Nagios admin a real mail address:
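For illustration, a contact definition along these lines would do; the address is a placeholder, and generic-contact is assumed to be the template shipped in the stock templates.cfg.

```cfg
define contact {
    contact_name    nagiosadmin
    use             generic-contact     ; template from the stock templates.cfg
    alias           Nagios Admin
    email           admin@example.com   ; placeholder address
}
```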
Set up individual users:
Set up groups for ourselves and our customers:
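A sketch of two such groups. The group admins exists in the stock contacts.cfg; the customer group and its member are hypothetical names, and the contact has to be defined separately.

```cfg
define contactgroup {
    contactgroup_name   admins
    alias               Our Administrators
    members             nagiosadmin
}

define contactgroup {
    contactgroup_name   customer-a-admins           ; hypothetical group name
    alias               Customer A Administrators
    members             customer-a-contact          ; hypothetical contact
}
```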
Some new templates are currently just placeholders; they make it easier to apply company-specific changes at a later point in time.
Host templates for own servers:
Host templates for customers:
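Such placeholder templates might look like the sketch below. The template names and the customer contact group are hypothetical; linux-server is assumed to be the template from the stock templates.cfg.

```cfg
define host {
    name            own-server          ; hypothetical template name
    use             linux-server        ; template from the stock templates.cfg
    contact_groups  admins
    register        0                   ; template only, not a real host
}

define host {
    name            customer-server     ; hypothetical template name
    use             linux-server
    contact_groups  customer-a-admins   ; hypothetical contact group
    register        0
}
```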
Define service groups for easier overviews in the web interface:
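A minimal sketch with two hypothetical group names:

```cfg
define servicegroup {
    servicegroup_name   os-checks               ; hypothetical name
    alias               Operating System Checks
}

define servicegroup {
    servicegroup_name   db-checks               ; hypothetical name
    alias               Database Checks
}
```

A service or service template joins a group through its servicegroups directive, for example servicegroups os-checks.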
And now the service templates. They are mainly used to shorten and standardize the actual service definitions.
First a subclass of the generic service to set the default contact group:
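As a sketch, assuming the stock generic-service template and the contact group admins; the template name company-service is hypothetical:

```cfg
define service {
    name            company-service     ; hypothetical template name
    use             generic-service     ; template from the stock templates.cfg
    contact_groups  admins
    register        0                   ; template only
}
```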
Now the OS level services. These and all others use NRPE by default; to test the local host, the commands will be given in services.cfg (see below).
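One such OS-level template could be sketched as follows. It assumes a check_nrpe command is defined on the server and that check_load is configured in nrpe.cfg on the remote node, as it is in the sample nrpe.cfg. The template name is hypothetical.

```cfg
define service {
    name                    os-load             ; hypothetical template name
    use                     generic-service
    service_description     Load
    check_command           check_nrpe!check_load
    register                0
}
```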
The next groups of service definitions are used to check databases. They refer to commands described in other articles of this How-To group (individual links to be supplied).
Here are the HTTP checks; they are run directly from the server, not through NRPE:
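A sketch of one HTTP template, assuming the check_http command from the stock commands.cfg, which runs the plugin on the Nagios server against $HOSTADDRESS$. The template name is hypothetical.

```cfg
define service {
    name                    http-check          ; hypothetical template name
    use                     generic-service
    service_description     HTTP
    check_command           check_http
    register                0
}
```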
And finally the simple SMTP check:
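In the same spirit, a hypothetical SMTP template built on the check_smtp command from the stock commands.cfg:

```cfg
define service {
    name                    smtp-check          ; hypothetical template name
    use                     generic-service
    service_description     SMTP
    check_command           check_smtp
    register                0
}
```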
Here we define host groups to improve the overviews in the web interface:
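For illustration, with hypothetical group names:

```cfg
define hostgroup {
    hostgroup_name  own-servers         ; hypothetical name
    alias           Our Servers
}

define hostgroup {
    hostgroup_name  customer-servers    ; hypothetical name
    alias           Customer Servers
}
```

A host is assigned to a group with the hostgroups directive in its host definition.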
In the subdirectory /usr/local/nagios/etc/objects/hosts we have one file per host we want to monitor. Each file contains one host and several service definitions.
To avoid DNS lookup problems, I use IP addresses only.
Here I give a few examples only:
This is the Nagios server itself - i.e. localhost.
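A shortened sketch of such a file, modelled on the stock localhost.cfg. It assumes the linux-server and local-service templates and the check_local_load command from the sample configuration. The thresholds are the sample defaults, not necessarily ours.

```cfg
define host {
    use         linux-server
    host_name   localhost
    alias       Nagios server
    address     127.0.0.1           ; IP address only, no DNS lookup
}

define service {
    use                     local-service
    host_name               localhost
    service_description     Current Load
    check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}
```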
wiki.gutzmann.com is our wiki server; we have set up Confluence with Postgres, so I include the Postgres checks as well.
Test the configuration
You can use the service command to check the configuration before you restart Nagios:
service nagios checkconfig
The error messages can be found in /usr/local/nagios/var/nagios.log
But it's much more convenient to use the nagios command, because you see the errors immediately:
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
Restart Nagios and check for errors
If you have modified files in /usr/local/nagios/etc (especially nagios.cfg), you must restart Nagios:
service nagios restart
If you have changed any other configuration files in /usr/local/nagios/etc, you should use "reload":
service nagios reload
Error messages can be found in
Instead of reading or tailing the files in the console, you may find it easier to use the web interface (provided Nagios restarted successfully).
Add Administrator Users