Monitoring Your Servers for Free (Part 4)

It has certainly been a month of monitoring blog posts over here! I have been running a little behind schedule thanks to a chest cold, but I’m on my way to recovery, and with a little coffee and determination we’ll get back to it. This post is the epic conclusion of our series on monitoring your servers for free. For this demonstration we will use the TurnKey Linux virtual appliance for Observium, which is built on Debian Wheezy. If you’d like to install Observium on your RHEL or CentOS server instead, instructions are available in the Observium documentation.



Download & Deploy TurnKey Linux Observium Virtual Appliance

To download the appliance, visit the TurnKey Linux website. You will have several choices of download format. If you are using Hyper-V I would recommend the ISO; since I will be using VMware ESXi 5.5, we will download the OVA template. Once it’s downloaded to your local machine, follow the steps below:

  1. Unzip the OVA template
  2. Open vSphere client and click File>Deploy OVF Template
  3. In the Deploy OVF Template wizard click Browse and point to the OVF template you unzipped
  4. Proceed through the remaining steps in the wizard customizing as appropriate for your environment
  5. Power on the TurnKey VM and open the console
  6. Proceed through the setup wizard for the virtual appliance


Prepare Windows Servers for Observium

Before we can begin reporting data to Observium, we must first enable SNMP on each server. This can be done through Server Manager using the Add Roles and Features wizard. In addition, you will need to ensure that the hostnames of the servers you’re monitoring can be resolved by Observium. This may require adding entries to your local DNS server if they do not already exist, or adding entries to /etc/hosts on the Observium server.

  1. Install SNMP (see above)
  2. Restart the Server (the SNMP options needed aren’t active until after reboot)
  3. Open Services (run command: services.msc)
  4. Double click the SNMP service
  5. Click the Security Tab
  6. Add a community name (e.g., Observe)
  7. Add the IP of the Observium server to the “Accept SNMP packets from these hosts” list
  8. Open a browser and navigate to your Observium server’s web page and login
  9. Under the devices menu choose add device
  10. Provide the hostname
  11. Choose SNMP v1
  12. Enter SNMP community name set in step 6 and click add device
  13. Wait a few minutes for Observium to begin collecting data on your new server
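If a device never starts showing data, it can help to confirm from the Observium server that the host actually answers SNMP with the community name you set. Here is a minimal sketch, assuming the net-snmp command line tools are installed on the Observium server (that is an assumption, not something the appliance setup above guarantees). The function name and the example hostname are made up for illustration.

```shell
#!/bin/sh
# Sketch: ask a host for its sysName over SNMP v1.
# Assumes net-snmp's snmpget is installed; set SNMPGET to use a different binary.
snmp_probe() {
    host=$1
    community=$2
    # 1.3.6.1.2.1.1.5.0 is sysName.0, a standard MIB-2 object.
    "${SNMPGET:-snmpget}" -v 1 -c "$community" "$host" 1.3.6.1.2.1.1.5.0
}

# Example (hypothetical hostname, community from step 6):
#   snmp_probe dc01.example.local Observe
```

A timeout here usually points at the community name, the accepted hosts list from step 7, or a firewall in between, rather than at Observium itself.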


Monitoring Linux Servers and Cisco Equipment

While I typically use New Relic for Linux monitoring, Observium has published instructions for adding Linux servers. There are also instructions available for adding Cisco switches and firewalls. Again, the gotcha here is that A records must exist in local DNS, or entries must be created in /etc/hosts on the Observium server, because all devices are managed by hostname, not by IP.
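If you go the /etc/hosts route, a small helper keeps you from adding the same device twice. This is only a sketch: the function name, the IP address, and the hostname are placeholders, and it should be run as root on the Observium server.

```shell
#!/bin/sh
# Sketch: append "IP<TAB>hostname" to a hosts file unless the hostname is already listed.
ensure_hosts_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Look for the name as a whole field on any non-comment line.
    if awk -v n="$name" '!/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$file"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}

# Example (placeholder address and name):
#   ensure_hosts_entry 192.0.2.10 sw01.example.local
```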






Monitoring Your Servers for Free (Part 3)

As we have reached part 3 of 4 in our series on monitoring your servers for free, I would like to take a moment to highlight a few gotchas with Nagios before moving into our final section, where we will set up graphical monitoring with Observium. At this point we’ve gone through building out a Nagios server, setting up contacts, monitoring printers, and monitoring both Windows and Linux servers. There are certainly a few issues you are bound to encounter on your journey with Nagios monitoring. These issues cost me hours of struggling, digging, and testing before I ultimately came upon a resolution. I would like to pay some of that effort forward in the hope of saving some poor soul out there the time, effort, and untold amounts of silent cursing and coffee drinking.


Monitoring SQL Express

Chances are, if you have been a Windows administrator for very long and have deployed an application server or two, you have had to set up at least one SQL Express database. The challenge of monitoring SQL Express with Nagios is that our friends at Microsoft have created a service name that contains a dollar sign (MSSQL$SQLEXPRESS). Nagios uses $ to delimit its macros, so the $ character must be escaped by doubling it ($$) in order for Nagios to read the service name correctly. Use the service definition below as a reference for making SQL Express play nice with Nagios.

define service{

use generic-service


service_description Service: SQL Server – SQLEXPRESS
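A working definition also needs host_name and check_command lines. Here is a sketch that prints a complete one with the dollar sign doubled. It assumes the check_nt SERVICESTATE pattern used by the sample windows.cfg that ships with Nagios Core. The function names and the host name you pass in are placeholders, so substitute the host_name from your own host definitions.

```shell
#!/bin/sh
# Double every "$" so Nagios reads it as a literal character instead of a macro delimiter.
nagios_escape() {
    printf '%s\n' "$1" | sed 's/\$/$$/g'
}

# Print a service definition for the SQL Express service on the given host.
# Assumption: check_nt with SERVICESTATE, as in the sample windows.cfg.
sql_express_service() {
    host=$1
    svc=$(nagios_escape 'MSSQL$SQLEXPRESS')
    cat <<EOF
define service{
    use                     generic-service
    host_name               $host
    service_description     Service: SQL Server - SQLEXPRESS
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l $svc
    }
EOF
}

# Example (placeholder host name):
#   sql_express_service winserver >> /usr/local/nagios/etc/objects/windows.cfg
```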



Nagios localhost Warnings for HTTP

On your Nagios server you may encounter a yellow warning message regarding the HTTP service. The reason is that the default HTTP check on the Nagios server requests a page from Apache, and with no index file in Apache’s default docroot, Apache answers with a 403 Forbidden status rather than a page. Resolving this on your Nagios server can be as simple as creating a blank file called index.html in the /var/www/html directory. A simple way to accomplish this is the touch command (ex: touch /var/www/html/index.html).

Once you create this dummy file you will need to restart apache (service httpd restart on RHEL or service apache2 restart on Debian variants) as well as the nagios service (service nagios restart).
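If you script this, it is worth guarding against clobbering a real page on a server that already hosts something. A minimal sketch, with a made-up function name and the docroot from above as the default:

```shell
#!/bin/sh
# Sketch: create an empty index.html only when the docroot has none.
ensure_index() {
    docroot=${1:-/var/www/html}
    # Leave any existing index.html untouched.
    [ -e "$docroot/index.html" ] || touch "$docroot/index.html"
}

# Example:
#   ensure_index /var/www/html
```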


Nagios Error When Rescheduling a Service Check (Error: Could not open command file ‘/usr/local/nagios/var/rw/nagios.cmd’ for update!)

This and the SQL Express issue may very well be the most aggravating Nagios issues I encountered in my first deployment. However, by being highly caffeinated and stubborn I did eventually find a solution. The root of the problem is a permission setting that gets flipped each time the Nagios service restarts. I have seen two different fixes for this issue, and both work. The first corrects the problem at its root by fixing the broken permissions. The second uses a script that resets the permission each time the Nagios service starts. I recommend attempting the first method first, as it is the preferable fix; if it doesn’t resolve your issue, try method two. Please note, terminal commands below are prefixed with #.

Method 1:

#usermod -a -G nagios apache (the -a flag appends the group; without it, usermod -G replaces apache’s other supplementary groups)

#grep nagios /etc/group (ensure that the result shows apache as a member of the nagios group)

#service httpd restart (substitute apache2 for httpd on Debian based variants)
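If you would rather not eyeball the grep output, the membership check can be scripted. This is a sketch only; the function name is made up, and it reads the group file directly, so it only sees supplementary members listed in the fourth field.

```shell
#!/bin/sh
# Sketch: succeed if USER is listed as a member of GROUP in a group file.
group_has_member() {
    group=$1
    user=$2
    file=${3:-/etc/group}
    # /etc/group fields: name:password:GID:comma-separated members
    awk -F: -v g="$group" -v u="$user" '
        $1 == g { n = split($4, m, ","); for (i = 1; i <= n; i++) if (m[i] == u) found = 1 }
        END { exit !found }' "$file"
}

# Example:
#   group_has_member nagios apache && echo "apache is in the nagios group"
```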


Method 2:

Alex Nogard’s blog lays out the methodology for creating a script that fixes the permission each time Nagios starts by adding the script into init.d/nagios. Please follow the link below for his instructions:


Automating NRPE Agent Deployment Through Puppet

Although outside the scope of this particular discussion, one way to automate deployment of NRPE across multiple web servers is to use Puppet. I may revisit this topic in later posts regarding Puppet Labs, but I will sum it up quickly before we wrap up. If you have an existing Puppet infrastructure, you can have Puppet add the EPEL repo to each LAMP server, declare the NRPE package with ensure => installed so that it gets installed, and finally push out a preconfigured nrpe.cfg (located in /etc/nagios) that contains the correct information for your Nagios server.


That brings me to the end of part 3 in our series on monitoring your servers for free. In the next installment, we’ll take a look at deploying Observium to give us graphical output for our servers. Til next time, may the coffee be endless and the uptime in your favor!


Monitoring Your Servers for Free (Part 2)

In Part 1 of our discussion, we covered how to install Nagios Core 4 from source on CentOS 7. Now that we have a Nagios server up and running, it’s time to begin monitoring things. To begin with, let’s cover agent vs. agentless Nagios monitoring. If you would like to perform simple checks, such as determining whether a switch or printer is responding to ping, you can set up basic ping monitoring without configuring an agent. As long as our Nagios server is able to reach the device and the device has ICMP enabled, everything will be happy. This type of monitoring can also be used to check website access, DNS resolving to expected locations, and SSL (however, that is not covered in this post).

To monitor uptime, resource utilization (HDD space, CPU, memory, etc.), and specific services, we will need the Nagios agent. The agent is available in both Linux and Windows flavors. The Nagios agent for Linux is called NRPE and can be installed using apt-get or yum (ex: yum install nrpe). On the Windows side, the Nagios agent is NSClient++, which can be downloaded as an executable installer from the NSClient++ website.

Nagios Server Side Configuration

Nagios retrieves its monitoring configuration from the nagios.cfg file located in /usr/local/nagios/etc. The nagios.cfg file contains cfg_file entries that point to the templates defining what is being monitored. When creating new templates, it is important to remember to go back and add the nagios.cfg entry corresponding to the new template (or to uncomment one of the default entries if you choose to use a default template).
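Forgetting that nagios.cfg entry is an easy mistake, so here is a small sketch that adds a cfg_file line only if it is not already there. The function name is made up, and the linux.cfg path in the example is just an illustration of a new template.

```shell
#!/bin/sh
# Sketch: register an object file in nagios.cfg unless the exact entry already exists.
add_cfg_file() {
    object_file=$1
    nagios_cfg=${2:-/usr/local/nagios/etc/nagios.cfg}
    line="cfg_file=$object_file"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF "$line" "$nagios_cfg" || printf '%s\n' "$line" >> "$nagios_cfg"
}

# Example:
#   add_cfg_file /usr/local/nagios/etc/objects/linux.cfg
```

Note that a commented-out entry (one starting with #) does not count as existing, so in that case you may prefer to uncomment the line by hand instead.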

The other location of interest is the set of templates in /usr/local/nagios/etc/objects. If you choose to modify the existing templates rather than creating new ones, I would recommend making a backup copy of each template in case you ever need to restore it or refer back to it (ex: cp windows.cfg windows.cfg.bak). One of the first templates we will want to modify is contacts.cfg; this is where you add the email address (or distribution list) that should receive Nagios alerts.

The final thing we need to discuss before we begin walking through some actual setups is that the NRPE agent listens on TCP port 5666. You will need to ensure that your Nagios server can reach each Linux server you are monitoring on port 5666, and that port 5666 is open in the firewall of each of those servers (see example below):

iptables -A INPUT -p tcp -m tcp --dport 5666 -j ACCEPT

Fortunately the NSClient++ agent makes the needed provisions in the Windows firewall on install, however you will still want to be aware of this if hardware firewalls or AWS security rules are between your Nagios server and the infrastructure that it is monitoring.

Monitoring Printers

Let’s go ahead and set up some basic printer monitoring now that we have an overview of the basics. For this example I am going to use the default network printer template provided by Nagios. In my examples I am using nano as the text editor. Depending on your installation of CentOS you may or may not have nano by default; you can install it (yum install nano) or use vi, vim, or any other Linux text editor.

  1. cd /usr/local/nagios/etc/objects
  2. cp printer.cfg printer.cfg.bak
  3. nano printer.cfg
  4. At this point we can create our host definitions. My configuration has entries for 2 printers. Customize your host definitions according to your environment; you can simply copy and paste to add more host definitions (customizing them with the appropriate info).
  5. Customize the host group if desired. I have left this at the default since I am only managing the printers for one site
  6. In the service definitions section you will want to replace the dummy host_name with the hostnames defined in your host definitions. To list multiple hosts, type them on the same line separated by commas (ex: host_name      IT_ColorLaser,Lobby_Copier). For my monitoring purposes I only care whether the printers are on the network, so I am only monitoring ping. Any unused service definitions must be commented out or deleted.
  7. nano /usr/local/nagios/etc/nagios.cfg
  8. Remove the # in front of the cfg_file path for printer.cfg
  9. At this point we can save and exit nagios.cfg and restart Nagios (systemctl restart nagios.service)
  10. If we’ve done everything right, the web page should now have an additional host group for network printers that is lit up green
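For reference, here is roughly what a printer host definition from step 4 looks like, wrapped in a helper that prints one block per printer. This is a sketch that assumes the generic-printer template and network-printers host group from the sample printer.cfg. The function name, aliases, and IP addresses are placeholders.

```shell
#!/bin/sh
# Sketch: print a Nagios host definition for one network printer.
# Assumes the generic-printer template and network-printers group from the sample printer.cfg.
printer_host() {
    name=$1
    label=$2
    address=$3
    cat <<EOF
define host{
    use         generic-printer
    host_name   $name
    alias       $label
    address     $address
    hostgroups  network-printers
    }
EOF
}

# Example (placeholder aliases and addresses):
#   printer_host IT_ColorLaser "IT Color Laser" 192.0.2.21 >> printer.cfg
#   printer_host Lobby_Copier  "Lobby Copier"   192.0.2.22 >> printer.cfg
```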


Monitoring Windows Servers

Now that we’ve successfully gone through setting up printer monitoring, let’s get started with Windows Server monitoring. In larger organizations, there will be large groups of servers configured with similar roles, features, and uses. If your IT environment is like most SMB shops, you may only have 1-2 Windows Servers configured the same way, and these are likely to be Domain Controllers. You may choose to set up all of your Windows Servers under a single Nagios template, associating host definitions only with the services you want monitored, or you may choose to create multiple templates based on function, so instead of everything being under windows-servers, you may have windows-domaincontrollers, windows-webservers, etc. I have my production environment set up using the latter method; however, for the sake of simplicity and minimizing config sprawl, we will use a single template here and add a couple of servers to it. The servers I have chosen are a 2012 Domain Controller and a 2008 R2 server configured with IIS.

  1. Prepare your Windows servers by downloading and installing NSClient++ from the NSClient++ website
  2. Proceed through the wizard, entering the IP address of your Nagios server and no password, and check the first 3 boxes
  3. Now that the client is prepared, we can ssh into our Nagios server and complete the configuration on the Nagios side
  4. cd /usr/local/nagios/etc/objects
  5. cp windows.cfg windows.cfg.bak
  6. nano windows.cfg
  7. Edit the host configuration to contain the information for your servers (2 servers are configured for this demo)
  8. Add the hostnames defined in the host configuration to the service definitions (note: on services such as W3SVC that apply only to one server, be sure to only include the hostname of the server it is applicable to).
  9. To monitor additional services you will need to create service definitions for them. The easiest way is to copy an existing service definition and customize it with the service you wish to monitor. I have done this in our example by copying the W3SVC service and using it as a template for our DNS Server and AD services. To find service names on your Windows server, open services.msc, locate the service, right click it, choose Properties, and make note of the service name and display name.
  10. In Server 2008 R2 and later, ICMP ping is turned off by default. This will result in a false positive for host down when monitored by Nagios. To enable ICMP, open cmd or PowerShell, type the following, and press enter: netsh firewall set icmpsetting 8
  11. nano /usr/local/nagios/etc/nagios.cfg and uncomment the cfg_file path for windows.cfg
  12. systemctl restart nagios
  13. At this point if we are successful we should see a host group with each server and its services listed
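Before the restart in step 12, it can save a headache to have Nagios validate the configuration first, since a typo in windows.cfg will otherwise stop the service from coming back up. Here is a sketch using the nagios -v pre-flight check. The function name and the three override variables are my own; the paths assume the source install from Part 1.

```shell
#!/bin/sh
# Sketch: restart Nagios only if the configuration passes the built-in sanity check.
# NAGIOS_BIN, NAGIOS_CFG and RESTART_CMD are optional overrides (names are illustrative).
verify_and_restart() {
    nagios_bin=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
    nagios_cfg=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}
    # "nagios -v" parses the main config and every object file without touching the running daemon.
    if "$nagios_bin" -v "$nagios_cfg"; then
        ${RESTART_CMD:-systemctl restart nagios}
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}

# Example:
#   verify_and_restart
```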

Monitoring Linux Servers

Monitoring a Linux server is a fairly simple process. In the environment I support, our most common use case for Linux is as a web server for our web based SaaS application. Because nearly every Linux server in our environment is configured the exact same way, a single template with multiple hosts fits well. If you are installing on CentOS you will need to enable the EPEL repo or obtain the NRPE package through wget. For more info about EPEL, see the Fedora Project’s EPEL documentation.

  1. yum install nrpe
  2. nano /etc/nagios/nrpe.cfg
  3. Locate the allowed_hosts portion of the nrpe.cfg
  4. Add a comma after the localhost address, add a space, then type the IP address of your Nagios server (ex: allowed_hosts=127.0.0.1, <IP of your Nagios server>)
  5. Save and exit
  6. chkconfig nrpe on (this enables NRPE at boot; also start it now with service nrpe start)
  7. Now that we’ve configured our client, let’s hop back over to the Nagios server
  8. cd /usr/local/nagios/etc/objects
  9. cp localhost.cfg linux.cfg
  10. nano linux.cfg
  11. Replace localhost in the host definitions with the name and IP of the web server(s) you are monitoring
  12. Change the hostgroup name to something else (ex: linux-webservers) as well as the alias (ex: Linux Web Servers)
  13. Replace all instances of localhost in the service definitions with the hostname of your web server(s) as defined in the host definitions
  14. Save and exit
  15. nano /usr/local/nagios/etc/nagios.cfg
  16. Copy and paste the definition for localhost and modify the description and path to correspond to the linux.cfg object
  17. systemctl restart nagios

If all has gone well at this point we should see another host group containing our web server(s).
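If you have more than a handful of web servers, editing allowed_hosts by hand gets old quickly. Here is a sketch of the client-side edit from steps 3 and 4, written so it is safe to run twice. The function name and the example IP are placeholders, it assumes a sed that supports -i (GNU sed on CentOS does), and it appends the address without the space after the comma.

```shell
#!/bin/sh
# Sketch: add the Nagios server IP to allowed_hosts in nrpe.cfg if it is not already listed.
# Assumes a sed with in-place editing (-i), as found on CentOS.
allow_nagios_server() {
    server_ip=$1
    nrpe_cfg=${2:-/etc/nagios/nrpe.cfg}
    # Split the current allowed_hosts value on commas and look for an exact match.
    if grep -E '^allowed_hosts=' "$nrpe_cfg" | tr ',' '\n' \
        | sed 's/^allowed_hosts=//; s/[[:space:]]//g' | grep -qxF "$server_ip"; then
        return 0
    fi
    # "&" re-inserts the existing line, then the new address is appended.
    sed -i "s/^allowed_hosts=.*/&,$server_ip/" "$nrpe_cfg"
}

# Example (placeholder address):
#   allow_nagios_server 192.0.2.5
```

Remember that NRPE only rereads nrpe.cfg when it starts, so restart the nrpe service on the client after changing the file.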





Thanks for sticking with me, I know this has been a long post. Hopefully this will help you to get your own free alert monitoring going with Nagios in your environment.