Archive for the ‘Nagios’ Category

Nagios: ndoutils giving problem with remote database connection

Thursday, December 3rd, 2009

I am in the middle of building a green Nagios environment, which means I am deploying virtual servers to do the work. The platform is SLES 10 SP3 x64 on an IBM x3550, with full virtualization enabled in the BIOS. Our environment is large enough (2000+ hosts and 3000+ services) that we broke Nagios into two sections: retail and corporate.

Splitting the environment this way required two virtual hosts, and a third virtual host was introduced to run the databases. By databases, I mean two separate Nagios databases, one for each area. The problem I encountered was with ndoutils. It refused to work properly at first, but after taking the steps below I had the data writing to a remote database.

First I compiled ndoutils-1.4b7 with the following arguments. You will have to install mysql-devel first:

sudo ./configure \
--prefix=/opt/nagios \
--with-mysql-lib=/usr/lib64/mysql \
--with-mysql-inc=/usr/include/mysql

After that finished I performed a sudo make, then copied the files to their respective directories as indicated in the README file.

ndo2db.cfg

  1. I created the var directory under my Nagios install and changed ownership to nagios:nagcmd.
  2. I changed the paths for the various sockets to my Nagios installation path.
  3. Changed the db_name to whatever the database is called.
  4. Changed the db_host to the FQDN of the virtual database server.
  5. Set db_user and db_pass (the password) for the nagios user.
  6. Saved my changes and closed the file.
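
The edits above can also be scripted. This is a minimal sketch, not the way I actually did it: set_cfg and apply_ndo2db_settings are hypothetical helper names, the host, database and password values are placeholders, and the key names (socket_name, db_host, db_name, db_user, db_pass) are the ones I recall from the ndo2db.cfg sample, so check them against your own copy.

```shell
#!/bin/sh
# Sketch: set key=value pairs in an ndo2db.cfg-style file.
# set_cfg FILE KEY VALUE replaces an existing "KEY=..." line or appends one.
# Limitation: VALUE must not contain '|', '&' or '\' (they are special to sed).
set_cfg() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # '|' is the sed delimiter so that paths containing '/' are safe.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Placeholder values (assumptions); substitute your own.
apply_ndo2db_settings() {
    cfg=$1
    set_cfg "$cfg" socket_name /opt/nagios/var/ndo.sock
    set_cfg "$cfg" db_host     db01.example.com
    set_cfg "$cfg" db_name     nagios_retail
    set_cfg "$cfg" db_user     nagios
    set_cfg "$cfg" db_pass     changeme
}

# Usage: apply_ndo2db_settings /opt/nagios/etc/ndo2db.cfg
```

Work on a copy of the file first; the function rewrites the file in place.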

ndomod.cfg

  1. I just changed the path of the files to my nagios installation path.
  2. Saved the changes and closed the file.

nagios.cfg

  1. Since I am using Nagios 3, all I had to do was uncomment the broker_module line for ndomod.
  2. I saved the changes and closed the file.
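
For reference, the uncommenting step might look like this in a script. It is only a sketch: enable_broker_module is a hypothetical helper, and it assumes your nagios.cfg already contains a commented-out broker_module line that mentions ndomod (the paths in the usage comment are examples from my install prefix).

```shell
#!/bin/sh
# Sketch: uncomment the "#broker_module=...ndomod..." line in a nagios.cfg-style
# file, then print every broker_module line that is now active.
# Other commented-out broker_module examples are left untouched.
enable_broker_module() {
    cfg=$1
    sed 's|^#[[:space:]]*\(broker_module=.*ndomod.*\)|\1|' "$cfg" > "$cfg.tmp" &&
        mv "$cfg.tmp" "$cfg"
    # Returns non-zero (and prints nothing) if no active line exists.
    grep '^broker_module=' "$cfg"
}

# Usage: enable_broker_module /opt/nagios/etc/nagios.cfg
```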

Here is where the fun comes into play. I was experiencing a problem with ndomod being unable to write to its data sink. So I went to the database host and performed the following steps:

  1. I untarred the ndoutils package on the database server.
  2. I logged into mysql and created the databases.
  3. I modified the mysql.sql file in the ndoutils package to reflect the database name.
  4. Then I ran sudo mysql < mysql.sql. That created the tables within each database.
  5. I then set up the nagios account in mysql using mysql_setpermission. I gave nagios access to all the databases from anywhere and set the password I established in the ndo2db.cfg file. That is very lax on my part, but the only one using the databases will be the nagios user.
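
If you want something tighter than "all databases from anywhere", a narrower grant could be generated like this. This is a sketch under assumptions: ndo_db_sql is a hypothetical helper, all names in the usage comment are placeholders, and the GRANT ... IDENTIFIED BY form is the MySQL 5.x syntax (newer MySQL releases want a separate CREATE USER).

```shell
#!/bin/sh
# Sketch: print the SQL that creates one NDO database and grants a single
# account, from a single host, the basic read/write privileges on it.
# Arguments: database name, MySQL user, client host, password.
ndo_db_sql() {
    db=$1 user=$2 host=$3 pass=$4
    # Backticks are escaped so the heredoc emits them literally.
    cat <<EOF
CREATE DATABASE IF NOT EXISTS \`$db\`;
GRANT SELECT, INSERT, UPDATE, DELETE ON \`$db\`.* TO '$user'@'$host' IDENTIFIED BY '$pass';
FLUSH PRIVILEGES;
EOF
}

# Usage (placeholder names), run once per database:
#   ndo_db_sql nagios_retail nagios nagios-retail.example.com secret | sudo mysql
```

The function only prints SQL, so you can read it over before piping it into mysql.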

I then started ndo2db and Nagios (which loads ndomod) on both virtual hosts. Running netstat -a showed the sockets connecting to my database server. I then ran select * from nagios_objects; on the database server and saw the tables being populated in both databases.
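
The netstat check can be narrowed down so you do not have to read the whole listing. A minimal sketch, assuming Linux netstat column layout and the default MySQL port 3306; mysql_connections is a hypothetical helper name and the host and database names in the comments are placeholders.

```shell
#!/bin/sh
# Sketch: read `netstat -an` output on stdin and print only established TCP
# connections whose remote (foreign) address ends in port 3306.
# Columns assumed: Proto Recv-Q Send-Q Local-Address Foreign-Address State.
mysql_connections() {
    awk '$1 ~ /^tcp/ && $5 ~ /[:.]3306$/ && $6 == "ESTABLISHED"'
}

# Usage on each Nagios host:
#   netstat -an | mysql_connections
# Then, on the database server, a quicker check than select *:
#   mysql -u nagios -p nagios_retail -e 'SELECT COUNT(*) FROM nagios_objects;'
```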

I hope this helps someone else, because I racked my brain over this for two days before stumbling on this solution.

Nagios: service and host escalations made simple

Friday, November 6th, 2009

You are asked to escalate a down host or service to either another technical level or add an incident to your Change Management System. What do you do? Nagios has escalation objects (hostescalation and serviceescalation) that you configure for the purpose of escalating host and/or service issues.

If you already have an existing Nagios host and service monitoring and notification system established, you will be up and running in two steps:

  1. Add any new contacts and/or contact groups.
  2. Add the escalation configuration.

I will explain how I added an escalation to notify a Change Management System on the first notification. Once the Change Request is sent you will no longer have to create any more problem tickets, since the technician working the problem should do two things: Acknowledge the problem through Nagios and work the problem ticket.

  1. Add any new contacts for the escalation. I had to add a contact for the Change Management System, since we are able to open an incident using email.
  2. name-of-your-contact-file.cfg

    define contactgroup{
            contactgroup_name               name_of_your_cms_group
            alias                           Name of Your CMS
            members                         name_of_your_cms
            }
    define contact{
            contact_name                    name_of_your_cms
            alias                           Name_of_Your_CMS
            contact_groups                  name_of_your_cms_group
            host_notifications_enabled      1
            service_notifications_enabled   1
            service_notification_period     24x7
            host_notification_period        24x7
            service_notification_options    c       ; We only need a problem ticket open when the service is critical
            host_notification_options       d       ; We only need a problem ticket open when the host is down
            service_notification_commands   notify-linux-service-by-email
            host_notification_commands      notify-linux-host-by-email
            email                           email_address@your_domain.com
            can_submit_commands             1
            }
  3. Add the escalation.
  4. Name_of_your_escalation.cfg

    define hostescalation{
            hostgroup_name          name_of_your_hostgroup
            first_notification      1
            last_notification       1
            notification_interval   5
            contact_groups          name_of_your_cms_group
            escalation_period       timeperiod_to_notify_create_incident    ; ex. 24x7
            escalation_options      d       ; We want a problem ticket created when the host is down.
            }

Now, you need to remember how an escalation works. The escalation is read into Nagios during the reload. Once the notification range in the escalation definition has passed, Nagios falls back to the notification attributes defined in your host and/or service template. So, in this instance Nagios will apply the escalation and notify your Change Management System once (1). Nagios will then continue notifications based upon the notification definitions within the template.

Like everything in life, there is a catch. If you create an escalation that includes a contact group already defined within your template, Nagios will only follow the escalation for that group. For example, if you have a tech_email contact group in the host or service definition and you add it alongside the CMS contact group within the escalation, both groups will only be notified once (1) when a host or service is down or critical.

That is all there is to creating an escalation for a single purpose. Now, reload Nagios and the escalation will take effect. For service escalations you will use a serviceescalation definition, which also needs a service_description. Play around with your escalations until you have the correct combination of attributes.
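
A service escalation that mirrors the hostescalation above could be generated like this. It is a sketch only: service_escalation is a hypothetical helper, the hostgroup, service and contact group names are placeholders, and the 24x7 timeperiod is assumed to exist in your configuration.

```shell
#!/bin/sh
# Sketch: print a serviceescalation definition for one service on a hostgroup.
# Arguments: hostgroup name, service description, contact group name.
service_escalation() {
    hostgroup=$1 service=$2 contacts=$3
    cat <<EOF
define serviceescalation{
        hostgroup_name          $hostgroup
        service_description     $service
        first_notification      1
        last_notification       1
        notification_interval   5
        contact_groups          $contacts
        escalation_period       24x7
        escalation_options      c       ; escalate only CRITICAL states
        }
EOF
}

# Usage (placeholder names):
#   service_escalation name_of_your_hostgroup HTTP name_of_your_cms_group \
#       >> Name_of_your_escalation.cfg
```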

I recommend you add your email address to both groups and reload Nagios. Then select the Notifications item in the Nagios web interface. The notification section will show you the contact groups being notified. What I do is add a bogus host definition, add its host group to the escalations file, then reload Nagios.
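
Since all of this testing involves a lot of reloads, it is worth checking the configuration before each one. A minimal sketch: nagios -v is the standard configuration check, but check_then_reload is a hypothetical helper, and the binary path, config path and init script below are assumptions based on my install prefix.

```shell
#!/bin/sh
# Sketch: verify the Nagios configuration and reload only if it passes.
# All three defaults are assumptions; override them in the environment.
NAGIOS_BIN=${NAGIOS_BIN:-/opt/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/opt/nagios/etc/nagios.cfg}
RELOAD_CMD=${RELOAD_CMD:-/etc/init.d/nagios reload}

check_then_reload() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        # Intentionally unquoted so the command and its arguments are split.
        $RELOAD_CMD
    else
        echo "config check failed; not reloading" >&2
        return 1
    fi
}

# Usage: check_then_reload
```

If the check fails, run the nagios -v command by hand to see which definition it complains about.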

Have fun and leave a comment if you have any questions or other suggestions for host and service escalations in Nagios.