Monday, August 2, 2010

Nagios


Nagios is a network monitoring system. It monitors the hosts and services you specify, alerting you when their behavior is not as desired and again when they return to their correct state. Nagios is licensed under the GNU General Public License Version 2 as published by the Free Software Foundation.

Features:

Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP).
Monitoring of host resources (processor load, disk usage, system logs).
Simple plugin design that allows users to develop their own service checks according to their needs, using their favorite tools (Bash, C++, Perl, Ruby, Python, PHP, C#, etc.).
Ability to define the network hierarchy, making it possible to distinguish between hosts that are down and hosts that are unreachable.
Notifications to contacts when problems occur in services or hosts, and when they are resolved (via email, pager, SMS, or any user-defined method with its corresponding plugin).
Ability to define event handlers that run when a host or service event occurs, for proactive problem resolution.
Support for implementing redundant monitoring hosts.
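
The plugin design in the feature list can be sketched briefly. A plugin is any executable that prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The Python sketch below checks filesystem usage; the function names and the 80/90 percent thresholds are illustrative assumptions, not part of Nagios.

```python
"""Minimal sketch of a Nagios-style plugin that checks disk usage."""
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_usage(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, status_line).

    The default thresholds are illustrative assumptions.
    """
    if used_percent >= crit:
        return CRITICAL, "DISK CRITICAL - %.1f%% used" % used_percent
    if used_percent >= warn:
        return WARNING, "DISK WARNING - %.1f%% used" % used_percent
    return OK, "DISK OK - %.1f%% used" % used_percent


def main(path="/"):
    """Print the status line and return the plugin exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # Anything that prevents the check from running is UNKNOWN.
        print("DISK UNKNOWN - %s" % exc)
        return UNKNOWN
    code, line = evaluate_usage(100.0 * usage.used / usage.total)
    print(line)
    return code

# A real plugin script would finish with: sys.exit(main())
```

Nagios only looks at the exit code and the printed line, which is why the same check could equally be written in Bash or Perl.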

Installation Requirements:
Apache HTTP Server
GCC Compiler
GD Library

To install these dependencies, run as root:

[root@golden ~]# yum -y install httpd gcc glibc-common glibc-devel gd gd-devel

Create an account for Nagios:

[root@golden ~]# useradd nagios
[root@golden ~]# passwd nagios

Create the nagcmd group so that external commands can be submitted via the web interface. Add the nagios and apache users to this group.

[root@golden ~]# groupadd nagcmd
[root@golden ~]# usermod -a -G nagcmd nagios
[root@golden ~]# usermod -a -G nagcmd apache

Download Nagios and the Plugins:

[franky@golden ~]$ mkdir nagios
[franky@golden ~]$ cd nagios
[franky@golden nagios]$ wget http://ufpr.dl.sourceforge.net/sourceforge/nagios/nagios-3.0.1.tar.gz
[franky@golden nagios]$ tar zxf nagios-3.0.1.tar.gz
[franky@golden nagios]$ wget http://ufpr.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.11.tar.gz
[franky@golden nagios]$ tar zxf nagios-plugins-1.4.11.tar.gz

Building and Installing Nagios

Become root:

[franky@golden nagios]$ su
[root@golden nagios]# cd nagios-3.0.1

Configuring Nagios:

[root@golden nagios-3.0.1]# ./configure --with-command-group=nagcmd --with-init-dir=/etc/init.d

Compiling Nagios:

[root@golden nagios-3.0.1]# make all

Install the binaries, the init script (in /etc/init.d), the sample configuration files, and the permissions for the external command directory:

[root@golden nagios-3.0.1]# make install
[root@golden nagios-3.0.1]# make install-init
[root@golden nagios-3.0.1]# make install-config
[root@golden nagios-3.0.1]# make install-commandmode


Configuring the web interface:

Install the Nagios web configuration file in Apache's conf.d directory:

[root@golden nagios-3.0.1]# make install-webconf

Create a nagiosadmin account, which is needed to log in to the Nagios web interface:

[root@golden nagios-3.0.1]# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
New password:
Re-type new password:
Adding password for user nagiosadmin

Restart Apache for the new settings to take effect:

[root@golden nagios-3.0.1]# /etc/init.d/httpd restart


Compiling and installing plugins

[root@golden nagios]# tar zxvf nagios-plugins-1.4.11.tar.gz
[root@golden nagios]# cd nagios-plugins-1.4.11
[root@golden nagios-plugins-1.4.11]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@golden nagios-plugins-1.4.11]# make
[root@golden nagios-plugins-1.4.11]# make install

Configuration

Add the check_nrpe command (used below) to the file /usr/local/nagios/etc/objects/commands.cfg:

define command {
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
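
To make the role of the macros concrete, here is a simplified Python sketch of how a service's check_command value such as check_nrpe!check_total_procs could be combined with the command_line above. The text after each "!" becomes $ARG1$, $ARG2$, and so on. The function name expand_command and the macro values passed in are assumptions for illustration; real Nagios handles many more macros and escaping rules.

```python
def expand_command(command_line, check_command, macros):
    """Expand $ARGn$ and other $MACRO$ references in a command_line.

    Simplified illustration only, not the Nagios implementation.
    """
    # "check_nrpe!check_total_procs" -> name "check_nrpe",
    # args ["check_total_procs"]
    name, *args = check_command.split("!")
    values = dict(macros)
    for index, arg in enumerate(args, start=1):
        values["ARG%d" % index] = arg
    expanded = command_line
    for key, value in values.items():
        expanded = expanded.replace("$%s$" % key, value)
    return name, expanded


name, line = expand_command(
    "$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$",
    "check_nrpe!check_total_procs",
    {
        # Assumed values: plugin directory and the host's address.
        "USER1": "/usr/local/nagios/libexec",
        "HOSTADDRESS": "sanjuan.fcld.local",
    },
)
print(line)
# /usr/local/nagios/libexec/check_nrpe -H sanjuan.fcld.local -c check_total_procs
```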

Register the configuration files for the hosts, host groups, services and service groups (their contents are shown later) in nagios.cfg:

cfg_file=/usr/local/nagios/etc/objects/hostgroups.cfg
cfg_file=/usr/local/nagios/etc/objects/hosts_fcld.cfg
cfg_file=/usr/local/nagios/etc/objects/services_fcld.cfg
cfg_file=/usr/local/nagios/etc/objects/servicegroups.cfg

Description of these files:

hostgroups.cfg: definition of the host groups.
hosts_fcld.cfg: definition of the hosts of the FCLD (Fundación Código Libre Dominicana).
services_fcld.cfg: definition of the services to monitor on the FCLD hosts.
servicegroups.cfg: definition of the service groups of the servers.

Defining the hosts in /usr/local/nagios/etc/objects/hosts_fcld.cfg:

define host {
    name                            fcld-server         ; Template name
    check_command                   check-host-alive    ; Command used to check whether the host is reachable
    check_period                    24x7                ; Monitoring period: 24 hours a day, 7 days a week
    max_check_attempts              3                   ; Maximum number of check attempts
    # normal_check_interval: normal monitoring interval
    process_perf_data               0                   ; Process performance data
    retain_nonstatus_information    0                   ; Retain non-status information across program restarts
    contact_groups                  sysadm              ; Group to which notifications are sent
    notification_interval           30                  ; Resend notifications every 30 minutes
    notification_period             24x7                ; Notification period
    notification_options            d,u,r               ; Send notifications for: d = down, u = unreachable, r = recovery
    register                        0                   ; Do not register this host; it is only a template
}
define host {
    use                             fcld-server
    host_name                       sanjuan.fcld.local
    alias                           San Juan
    address                         sanjuan.fcld.local
}
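
The relationship between a template (register 0) and a host that refers to it with "use" can be sketched as follows. This is a simplified illustration in Python, assuming a single template per object and treating name, use and register as directives that are not inherited; the function name resolve and the dictionary layout are hypothetical, not Nagios internals.

```python
# Directives that belong to the template itself and are not passed on.
NOT_INHERITED = {"name", "use", "register"}


def resolve(obj, templates):
    """Return the effective directives of obj after following `use`.

    Simplified: handles one template per object; values set on the
    object itself override values inherited from the template.
    """
    effective = {}
    parent_name = obj.get("use")
    if parent_name is not None:
        inherited = resolve(templates[parent_name], templates)
        effective.update(
            {k: v for k, v in inherited.items() if k not in NOT_INHERITED}
        )
    effective.update(obj)
    return effective


templates = {
    "fcld-server": {
        "name": "fcld-server",
        "check_command": "check-host-alive",
        "max_check_attempts": "3",
        "register": "0",
    }
}
host = {
    "use": "fcld-server",
    "host_name": "sanjuan.fcld.local",
    "alias": "San Juan",
    "address": "sanjuan.fcld.local",
}
effective = resolve(host, templates)
print(effective["check_command"])  # check-host-alive
```

The host ends up with the template's check and notification settings plus its own host_name, alias and address, which is why each real host definition can stay four lines long.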

Defining the services in /usr/local/nagios/etc/objects/services_fcld.cfg:

define service {
    name                    fcld-service        ; Template name
    use                     generic-service     ; Base template
    check_period            24x7                ; Monitoring period
    max_check_attempts      3                   ; Maximum number of check attempts
    normal_check_interval   3                   ; Normal check interval: 3 minutes
    retry_check_interval    1                   ; Retry the check after 1 minute
    contact_groups          sysadm              ; Contact group to notify
    notification_options    w,u,c,r             ; Notify on: w = warning, u = unknown, c = critical, r = recovery
    notification_interval   30                  ; Re-notification interval
    notification_period     24x7                ; Notification period: 24 hours a day, 7 days a week
    register                0                   ; Do not register this service; it is only a template
}

define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     Processes
    check_command           check_nrpe!check_total_procs
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     Zombie processes
    check_command           check_nrpe!check_zombie_procs
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     Swap
    check_command           check_nrpe!check_swap
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     Users
    check_command           check_nrpe!check_users
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     System load
    check_command           check_nrpe!check_load
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     Disk /
    check_command           check_nrpe!check_disk
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     Disk /boot
    check_command           check_nrpe!check_disk_boot
}
define service {
    use                     fcld-service
    host_name               santodomingo.fcld.local
    service_description     HTTP
    check_command           check_http
}
define service {
    use                     fcld-service
    host_name               sanjuan.fcld.local
    service_description     SSH
    check_command           check_ssh
}

Defining the contacts in /usr/local/nagios/etc/objects/contacts.cfg:

define contact {
    use                             generic-contact         ; Base template
    contact_name                    franky                  ; Name of the contact within Nagios
    alias                           Franky Almonte          ; Alias, a descriptive name
    email                           suresh.sonu2@gmail.com  ; Contact e-mail
    service_notification_period     24x7                    ; Notification period for services
    host_notification_period        24x7                    ; Notification period for hosts
    service_notification_options    w,u,c,r                 ; Service states to report: w = warning, u = unknown, c = critical, r = recovery
    host_notification_options       d,u,r                   ; Host states to report: d = down, u = unreachable, r = recovery (up states)
    service_notification_commands   notify-by-email         ; Send service notifications by e-mail using the notify-by-email command
    host_notification_commands      host-notify-by-email    ; Send host notifications by e-mail
}
define contact {
    use                             generic-contact
    contact_name                    cristhian
    alias                           Cristhian Nunez
    email                           suresh.sonu2@gmail.com
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-by-email
    host_notification_commands      host-notify-by-email
}

Defining the contact group in /usr/local/nagios/etc/objects/contacts.cfg:

define contactgroup {
    contactgroup_name   sysadm              ; Group name
    alias               FCLD SYSADM         ; Group alias
    members             franky,cristhian    ; Members of the group
}

Defining the host group in /usr/local/nagios/etc/objects/hostgroups.cfg:

define hostgroup {
    hostgroup_name  servers         ; Group name
    alias           FCLD Servers    ; Group alias
    members         localhost,sanjuan.fcld.local,santodomingo.fcld.local    ; Members. The names are separated by commas.
}



Defining the service group in /usr/local/nagios/etc/objects/servicegroups.cfg:

define servicegroup {
    servicegroup_name   SSH             ; Service group name
    alias               Secure Shell    ; Group alias
    members             sanjuan.fcld.local,SSH,santodomingo.fcld.local,SSH  ; Members. Format: host name,service name
}
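
The members format of a service group (host name, service name, host name, service name, ...) is easy to get wrong, so a small Python sketch of how such a value breaks down into pairs may help. The function name parse_members is hypothetical and this is not how Nagios itself is implemented.

```python
def parse_members(value):
    """Split a servicegroup members value into (host, service) pairs."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) % 2 != 0:
        # An odd number of fields means a host or a service is missing.
        raise ValueError("members must be a list of host,service pairs")
    # Even positions are host names, odd positions are service names.
    return list(zip(parts[0::2], parts[1::2]))


pairs = parse_members("sanjuan.fcld.local,SSH,santodomingo.fcld.local,SSH")
for host, service in pairs:
    print(host, service)
# sanjuan.fcld.local SSH
# santodomingo.fcld.local SSH
```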

Start Nagios

Add Nagios to the list of system services that start automatically when the system boots:

[root@golden nagios]# chkconfig --add nagios
[root@golden nagios]# chkconfig nagios on

To verify the configuration before starting:

[root@golden ~]# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg


If there are no errors, start Nagios:

[root@golden ~]# /etc/init.d/nagios start

Alternatively, start Nagios with the service command:

[root@golden ~]# service nagios start


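At this point Nagios is installed! However, it only contains a basic configuration that monitors the essential services of the machine it runs on.

To access the Nagios web interface, type the following URL into your browser, replacing nombre.o.ip.del.servidor with the name or IP address of the server: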
http://nombre.o.ip.del.servidor/nagios

Use the following username and password:

Username: nagiosadmin
Password: nagiosadmin

Figure 1: Accessing Nagios


Figure 2: Home Nagios


Figure 3: Services installed by default for localhost

Monitoring GNU/Linux machines

To monitor GNU/Linux machines, we need to run commands remotely on them. This is made possible by the NRPE service, through which Nagios can run commands that return the information we need about the CPU, memory, disks, and so on.

The monitoring architecture using NRPE is:



Installing NRPE

Create a user for NRPE:

[root@golden ~]# useradd nagios
[root@golden ~]# passwd nagios

Installing plugins:

[root@sanjuan nagios]# wget http://ufpr.dl.sourceforge.net/sourceforge/nagiosplug/nagios-plugins-1.4.12.tar.g
[root@sanjuan nagios]# tar zxvf nagios-plugins-1.4.12.tar.gz
[root@sanjuan nagios]# cd nagios-plugins-1.4.12/
[root@sanjuan nagios-plugins-1.4.12]# ./configure
[root@sanjuan nagios-plugins-1.4.12]# make
[root@sanjuan nagios-plugins-1.4.12]# make install
[root@sanjuan nagios-plugins-1.4.12]# chown nagios:nagios /usr/local/nagios/
[root@sanjuan nagios-plugins-1.4.12]# chown -R nagios:nagios /usr/local/nagios/libexec/

Installing xinetd:

[root@sanjuan nagios]# yum -y install xinetd

Download and install NRPE:

[root@sanjuan nagios]# wget http://ufpr.dl.sourceforge.net/sourceforge/nagios/nrpe-2.12.tar.gz
[root@sanjuan nagios]# tar zxvf nrpe-2.12.tar.gz
[root@sanjuan nagios]# cd nrpe-2.12/
[root@sanjuan nrpe-2.12]# ./configure --disable-ssl
[root@sanjuan nrpe-2.12]# make all
[root@sanjuan nrpe-2.12]# make install-plugin
[root@sanjuan nrpe-2.12]# make install-daemon
[root@sanjuan nrpe-2.12]# make install-daemon-config
[root@sanjuan nrpe-2.12]# make install-xinetd


Add the NRPE service to the list of system services in /etc/services:

[root@sanjuan nrpe-2.12]# vim /etc/services

Add at the end of the file:
nrpe 5666/tcp # NRPE


Add the IP address of the Nagios server to the xinetd configuration to allow it to access the NRPE service on this machine:

[root@sanjuan nrpe-2.12]# vim /etc/xinetd.d/nrpe
service nrpe
{
    flags           = REUSE
    socket_type     = stream
    port            = 5666
    wait            = no
    user            = nagios
    group           = nagios
    server          = /usr/local/nagios/bin/nrpe
    server_args     = -c /usr/local/nagios/etc/nrpe.cfg --inetd
    log_on_failure  += USERID
    disable         = no
    # 192.168.100.12 is the IP address of the Nagios server
    only_from       = 127.0.0.1 192.168.100.12
}

The xinetd service must start at boot on the remote GNU/Linux machines. To enable this, run:

[root@santiago objects]# chkconfig --level 345 xinetd on
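
Once xinetd is running, it can be useful to confirm from the Nagios server that the NRPE port accepts connections before defining any services. The Python sketch below only tests that a TCP connection to port 5666 can be opened; it does not speak the NRPE protocol, and the function name nrpe_port_open is an assumption for illustration.

```python
import socket


def nrpe_port_open(host, port=5666, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    This checks reachability only; it does not send an NRPE query.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unresolvable name, filtered by only_from, ...
        return False
```

For example, nrpe_port_open("sanjuan.fcld.local") run on the Nagios server should return True when xinetd is listening and the server's address is allowed.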

The check_nrpe command syntax is:
check_nrpe -H <host> [-n] [-u] [-p <port>] [-t