Archive for October, 2009

Nagios: Creating a Visual Map from Nagios data

Saturday, October 31st, 2009

I can see there is a lot of demand for my Nagios Small Administrator Guide. That is great and I hope it is helping people. I am currently involved with a project to implement Nagios in an environment that monitors more than 400 retail stores nationwide.

What I am going to do for the Nagios community is put together a zip file with the directories and files required to accomplish this task. You will need to install Nagios, NDODB, and NagVis for this project, as well as have MySQL installed. I have installed all of those products before, and they are not difficult if you follow the directions provided.

When you have implemented the project within your environment you will have a diagram like this one:

[Image: US Map of Stores]

Attached you will find the project. Many consulting companies would charge thousands of dollars for creating this map. All I ask is that you check out some of the sponsors on this site and their products. The best source for designing and creating objects within Nagios is:
Nagios Map.ZIP

Nagios: Suggestion for Monitoring NFS

Saturday, October 31st, 2009

A networked file system (NFS) is complex both to monitor and to send notifications for. System Administrators are concerned with the network, the cluster, server hardware and software, and management of the file system. End users are concerned with drive space and availability. Nagios® provides the perfect solution for monitoring an NFS.

Scenario

Upper management is concerned about the NFS. End users are complaining that they cannot access a shared drive or save files to it. How can Nagios® save the day once again?

Solution

  1. Identify how the system works. Create a diagram on a white-board and encourage input from all administrators. The diagram needs to include all hosts, host name aliases, special software, CNAMEs, file names and at least one switch level above the hosts.
  2. Using a diagramming tool, like Dia, transfer the white-board diagram to a file (the diagram is required later).
  3. Identify objects. For this exercise, Nagios® has two types of objects to identify: hosts and services. Write down all the objects. Host objects should include switches, servers and host aliases. Services should include all service checks for production hosts (template), PING for switches (template) and your NFS disk size checks (new template). Establish your parent/child relationships.
  4. Identify the best plugins for monitoring your objects. A host’s UP/DOWN state should be monitored with a service check, like check_users. NFS disk sizes would be monitored with check_disk.
  5. Create definitions. Nagios® needs configurations in order to start working with the NFS. The discussion will be limited to the NFS aspects, since many of the service checks are standard.

    1. Create an NFS hostgroup. The value of defining the hostgroup will become apparent if you ever need to move servers in and out of the NFS group.
      define hostgroup{
      hostgroup_name		nfs-prod
      alias			Check NFS Mounts
      }
    2. Place all NFS hosts in a distinct host file. You may want to call the file prod_nfs.cfg. Now, define the hosts in the file:
      define host{
      use			prod-nfs-servers            ; Name of host template to use
      host_name		nfs_services
      alias			nfs_services
      notes			Cluster Heartbeat LOCATION: DC9-M09 CONSOLE: nfs_servicesmgmt
      icon_image		officialpenguin.gif
      icon_image_alt		Linux Host
      notification_period	24x7
      address			146.87.90.4
      hostgroups		nfs-prod
      parents			nfs_alias, switch-one-level-up
      }
    3. Create the command. Create distinct command names for each disk, so you can adjust the thresholds and contacts later. If only one person is responsible for maintaining all of the NFS disks, a single command will suffice.
      define command{
      command_name		check_nfs_edient
      command_line		$USER1$/check_disk
      }
    4. Create the service. Create distinct services if more than one person or group is responsible for maintaining the disks.
      define service{
      use			nfs-service
      host_name		nfs_services
      notes			Place notes about the NFS disk in this location, such as the name by which your end users know the disk. You will see where this comes in handy later.
      service_description	check_edient
      check_command		check_nrpe!check_nfs_edient
      max_check_attempts	3
      normal_check_interval	5
      retry_check_interval	1
      check_period		24x7
      contact_groups		Name of group responsible for this disk
      }
    5. Create contact groups and contacts. Defining the contact group and contacts within the same ".cfg" file makes maintenance easier.
      define contactgroup{
      contactgroup_name	edient_admins
      alias			EDIENT ADMINS
      members			edient_admins
      }
      
      define contact{
      contact_name		edient_admins
      alias			EDIENT Admins
      contact_groups		edient_admins
      host_notifications_enabled      0
      service_notifications_enabled   1
      service_notification_period     24x7
      host_notification_period        24x7
      service_notification_options    w,u,c,r
      host_notification_options       d,r
      service_notification_commands   notify-nfs-service-by-email
      host_notification_commands      notify-linux-host-by-email
      email                           define individual email addresses here.
      }
    6. Create a custom notification. Do not expect end users to interpret a notification the same way a System Administrator would. Format the email so the recipient can understand the problem and take corrective action. I created a custom email command for NFS end user notifications.
      define command{
      command_name		notify-nfs-service-by-email
      command_line		/usr/bin/printf "%b" "You have been identified as part of the group that has access to this NFS mount. Please remove any unneeded files in order to reduce the size of this directory.\n\nService:\t\t$SERVICEDESC$\nState:\t\t\t$SERVICESTATE$\nDate/Time:\t\t$LONGDATETIME$\nAdditional Info:\t$SERVICEOUTPUT$ \n\nThis directory may also be known as $SERVICENOTES$" | /usr/bin/mailx -s "$HOSTALIAS$:$SERVICESTATE$ - $NOTIFICATIONTYPE$ Service Alert: $HOSTALIAS$ $SERVICEDESC$" $CONTACTEMAIL$
      }

      #NOTE: The difference here is the defined macro "$SERVICENOTES$". This tells the end users what the problem is without using technical terms.

  6. Test the Nagios® configuration. Ensure no mistakes exist within your object definitions: $PREFIX/bin/nagios -v $PREFIX/etc/nagios.cfg. Reload Nagios®: /etc/init.d/nagios reload. Lastly, test the email notification by placing your email address in the "email" list for the contacts. Then, within the Nagios web console, locate the host and services you just created. Click on the service, choose "send custom service notification", check the "force" box, add a comment and commit. Check the email you received to ensure correct formatting, then click on the "Notifications" section of the web console and make sure the email was sent only to the contact group you stipulated.
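
The host and service definitions in step 5 inherit from two templates (prod-nfs-servers and nfs-service) and name two parent hosts (nfs_alias and switch-one-level-up) that are never shown. Here is a minimal sketch of what those supporting objects might look like. Every value is an assumption for illustration. The check-host-alive command and the admins contact group come from the sample configuration shipped with Nagios and may not exist in your installation, and the 192.0.2.x addresses are placeholders from the documentation address range.

```
# Hypothetical supporting objects; all values are illustrative assumptions.

# Host template referenced by "use prod-nfs-servers"
define host{
name				prod-nfs-servers
check_command			check-host-alive	; Assumes the sample commands.cfg definition
max_check_attempts		3
check_period			24x7
notification_interval		60
notification_period		24x7
notification_options		d,u,r
contact_groups			admins			; Assumes an "admins" contact group exists
register			0			; Template only, not a real host
}

# Service template referenced by "use nfs-service"
define service{
name				nfs-service
active_checks_enabled		1
notifications_enabled		1
notification_interval		60
notification_period		24x7
notification_options		w,u,c,r
register			0			; Template only, not a real service
}

# Parent chain: switch -> NFS alias -> NFS server
define host{
use				prod-nfs-servers
host_name			switch-one-level-up
alias				Switch above the NFS hosts
address				192.0.2.1		; Placeholder address
}

define host{
use				prod-nfs-servers
host_name			nfs_alias
alias				NFS host alias
address				192.0.2.10		; Placeholder address
parents				switch-one-level-up
}
```

With the parent chain in place, Nagios can mark the NFS server as UNREACHABLE rather than DOWN when the switch above it fails, which keeps host notifications pointed at the real fault.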

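The service in step 5 calls check_nrpe!check_nfs_edient, while the check_disk command is shown without any thresholds. The sketch below shows one way the two commands might be filled in. The 20%/10% free-space thresholds and the /mnt/edient mount point are hypothetical, and the check_nrpe definition assumes the NRPE plugin is installed in the $USER1$ directory. When the check runs through NRPE, the check_disk command line would live under the check_nfs_edient command name in the NRPE configuration on the NFS host rather than on the Nagios server.

```
# Hypothetical thresholds and mount point; adjust per disk.
# WARNING when less than 20% is free, CRITICAL when less than 10% is free.
define command{
command_name		check_nfs_edient
command_line		$USER1$/check_disk -w 20% -c 10% -p /mnt/edient
}

# Hands the remote command name ($ARG1$) to the NRPE daemon on the host.
define command{
command_name		check_nrpe
command_line		$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```

Keeping one command name per disk, as step 5 suggests, means each disk can get its own thresholds without touching the others.
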
Establishing monitoring and notification for an NFS can be complicated and time consuming. However, if you take the time to plan how you want to accomplish the monitoring and notifications, it becomes a simpler task. I hope this helps with your Nagios® instance.

Thoughts or Questions? Please share your NFS experiences with our community.