I. Installation Steps
1. Installing NDOUtils
# Unpack the NDOUtils source
cd /home/taolei/copy/pkg/
tar zxvf ndoutils-2.0.0.tar.gz
cd ndoutils-2.0.0
# Install the MySQL development headers and libraries
yum install mysql-devel
# Link the MySQL headers and libraries into the default search paths
ln -sf /var/lib/mysql/include/* /usr/include/
mkdir -p /usr/include/mysql
ln -sf /var/lib/mysql/include/* /usr/include/mysql/
ln -sf /var/lib/mysql/lib/* /usr/lib/
mkdir -p /usr/lib/mysql
ln -sf /var/lib/mysql/lib/* /usr/lib/mysql
# Configure and build
./configure LDFLAGS=-L/opt/mysql/lib
make
# Copy the Nagios 4.x binaries into the Nagios bin directory
cd src/
cp ndomod-4x.o ndo2db-4x log2ndo file2sock /usr/local/nagios/bin
# Create the database and its tables
cd ../db
mysqladmin -u root -proot create nagios
./installdb -u root -p root -d nagios -h localhost
# Install the sample configuration files and set their ownership and permissions
cd ../config
cp ndo2db.cfg-sample /usr/local/nagios/etc/ndo2db.cfg
cp ndomod.cfg-sample /usr/local/nagios/etc/ndomod.cfg
chown nagios:nagcmd /usr/local/nagios/etc/ndo2db.cfg
chown nagios:nagcmd /usr/local/nagios/etc/ndomod.cfg
chmod 744 /usr/local/nagios/etc/ndo2db.cfg
chmod 744 /usr/local/nagios/etc/ndomod.cfg
# Edit ndo2db.cfg (see section 2 for the settings)
nano /usr/local/nagios/etc/ndo2db.cfg
# Edit nagios.cfg and add the broker_module line shown below
nano /usr/local/nagios/etc/nagios.cfg
broker_module=/usr/local/nagios/bin/ndomod-4x.o config_file=/usr/local/nagios/etc/ndomod.cfg
# Start the ndo2db daemon
/usr/local/nagios/bin/ndo2db-4x -c /usr/local/nagios/etc/ndo2db.cfg
# Check the system log
tail -5 /var/log/messages
Nov 7 00:57:46 localhost nagios: wproc: Core Worker 18303: job 34 (pid=25076): Dormant child reaped
Nov 7 17:01:39 localhost kernel: hpet1: lost 1 rtc interrupts
Nov 7 17:01:39 localhost nagios: Warning: A system time change of 57588 seconds (0d 15h 59m 48s forwards in time) has been detected. Compensating...
Nov 7 17:01:58 localhost nagios: wproc: Core Worker 18302: job 41 (pid=25177) timed out. Killing it
Nov 7 17:01:58 localhost nagios: wproc: Core Worker 18302: job 38322480 with pid 25177 reaped at timeout. timeouts=2; started=42
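As a quick sanity check after the steps above, the following sketch tests whether nagios.cfg contains an active (uncommented) broker_module line that loads ndomod. The function name broker_module_configured is an illustrative assumption and is not part of NDOUtils.

```shell
#!/bin/sh
# Return 0 if the given nagios.cfg has an uncommented broker_module
# line that points at an ndomod object file (for example ndomod-4x.o).
# broker_module_configured is a hypothetical helper name.
broker_module_configured() {
    # $1 = path to nagios.cfg
    grep -Eq '^[[:space:]]*broker_module=.*ndomod[^[:space:]]*\.o' "$1"
}
```

For example, broker_module_configured /usr/local/nagios/etc/nagios.cfg && echo ok prints ok once the line has been added and is not commented out.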
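2. Configuring NDOUtils
2.1 Configuration with MySQL on the local machine
Note: the local machine can also be treated as a remote machine with IP 127.0.0.1, so the remote-machine configuration below works here as well.
Edit the ndo2db.cfg configuration file: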
nano /usr/local/nagios/etc/ndo2db.cfg
lock_file=/usr/local/nagios/var/ndo2db.lock
ndo2db_user=nagios
ndo2db_group=nagcmd
socket_type=unix
socket_name=/usr/local/nagios/var/ndo.sock
tcp_port=5668
use_ssl=0
db_servertype=mysql
db_host=localhost
db_port=3306
db_name=nagios
db_prefix=nagios_
db_user=nagios
db_pass=root
max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640
max_externalcommands_age=44640
debug_level=0
debug_verbosity=1
debug_file=/usr/local/nagios/var/ndo2db.debug
max_debug_file_size=1000000
Edit the ndomod.cfg configuration file:
nano /usr/local/nagios/etc/ndomod.cfg
instance_name=default
output_type=unixsocket
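# With output_type=unixsocket, output is the path of the ndo2db socket (socket_name in ndo2db.cfg)
output=/usr/local/nagios/var/ndo.sock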
tcp_port=5668
use_ssl=0
output_buffer_items=5000
buffer_file=/usr/local/nagios/var/ndomod.tmp
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=2
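The two files have to agree on the transport: socket_type=unix in ndo2db.cfg pairs with output_type=unixsocket in ndomod.cfg, and socket_type=tcp pairs with output_type=tcpsocket. A minimal sketch of that consistency check follows; cfg_get and socket_types_match are hypothetical helper names, and the parsing assumes plain key=value lines without trailing whitespace.

```shell
#!/bin/sh
# Print the value of the first uncommented "key=value" line for a key.
# $1 = key, $2 = config file. cfg_get is a hypothetical helper.
cfg_get() {
    sed -n "s/^[[:space:]]*$1=//p" "$2" | head -n 1
}

# Return 0 when ndo2db's socket_type and ndomod's output_type agree
# (unix <-> unixsocket, tcp <-> tcpsocket).
# $1 = ndo2db.cfg, $2 = ndomod.cfg
socket_types_match() {
    [ "$(cfg_get socket_type "$1")socket" = "$(cfg_get output_type "$2")" ]
}
```

A call such as socket_types_match /usr/local/nagios/etc/ndo2db.cfg /usr/local/nagios/etc/ndomod.cfg || echo "transport mismatch" flags a mismatched pair before the daemons are started.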
2.2 Configuration with MySQL on a remote machine
Edit the ndo2db.cfg configuration file:
nano /usr/local/nagios/etc/ndo2db.cfg
lock_file=/usr/local/nagios/var/ndo2db.lock
ndo2db_user=nagios
ndo2db_group=nagcmd
# Double-check the user and group above; wrong values cause permission problems later
socket_type=tcp
socket_name=/usr/local/nagios/var/ndo.sock
tcp_port=5668
use_ssl=0
db_servertype=mysql
db_host=localhost
db_port=3306
db_name=nagios
db_prefix=nagios_
db_user=nagios
db_pass=root
max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640
max_externalcommands_age=44640
debug_level=0
debug_verbosity=1
debug_file=/usr/local/nagios/var/ndo2db.debug
max_debug_file_size=1000000
Edit the ndomod.cfg configuration file:
nano /usr/local/nagios/etc/ndomod.cfg
instance_name=default
output_type=tcpsocket
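# With output_type=tcpsocket, output is the address of the host running ndo2db (127.0.0.1 here because the daemon runs on this machine)
output=127.0.0.1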
tcp_port=5668
use_ssl=0
output_buffer_items=5000
buffer_file=/usr/local/nagios/var/ndomod.tmp
file_rotation_interval=14400
file_rotation_timeout=60
reconnect_interval=15
reconnect_warning_interval=15
data_processing_options=-1
config_output_options=2
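The ndomod.cfg files in 2.1 and 2.2 differ mainly in output_type and in what output points at. As a sketch of switching an existing file to TCP mode, the filter below rewrites just those two keys and passes every other line through unchanged. The name ndomod_to_tcp is a hypothetical helper, and 192.0.2.10 in the usage note is a placeholder documentation address, not a value from this setup.

```shell
#!/bin/sh
# Read an ndomod.cfg on stdin and write a TCP-mode version to stdout.
# $1 = address of the host running ndo2db.
# Only output_type= and output= lines are changed; keys such as
# output_buffer_items= are left alone because the pattern requires
# "=" directly after "output".
ndomod_to_tcp() {
    sed -e 's/^output_type=.*/output_type=tcpsocket/' \
        -e "s/^output=.*/output=$1/"
}
```

For example, ndomod_to_tcp 192.0.2.10 < ndomod.cfg > ndomod.cfg.tcp writes the converted file next to the original so the result can be reviewed before it replaces the live configuration.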
II. Common Problems
1. The service fails to start
Error message: Could not bind socket: Address already in use
Solution:
1) Delete ndo.sock from /usr/local/nagios/var/:
rm -f /usr/local/nagios/var/ndo.sock
2) Restart the ndo2db daemon:
/usr/local/nagios/bin/ndo2db-4x -c /usr/local/nagios/etc/ndo2db.cfg
(Deleting the socket file is usually enough. If the error persists, check port 5668 as follows.)
Find the process listening on port 5668 and kill it:
[root@localhost etc]# netstat -apn | grep 5668
tcp 0 0 0.0.0.0:5668 0.0.0.0:* LISTEN 1533/ndo2db
[root@localhost etc]# kill -3 1533 # 1533 is the PID of the process to kill
[root@localhost etc]# /usr/local/nagios/bin/ndo2db-4x -c /usr/local/nagios/etc/ndo2db.cfg # start ndo2db again
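The PID lookup above can be scripted. The sketch below pulls the PID out of netstat -apn style output for a given listening port. It assumes the column layout shown in the sample line (state in field 6, PID/program in field 7); listener_pid is a hypothetical helper name.

```shell
#!/bin/sh
# Read `netstat -apn` style lines on stdin and print the PID of the
# first process listening on TCP port $1. Prints nothing if no
# listener is found. listener_pid is a hypothetical helper.
listener_pid() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, parts, "/")   # field 7 looks like "1533/ndo2db"
            print parts[1]
            exit
        }'
}
```

Used as netstat -apn | listener_pid 5668, it prints the PID to pass to kill, or nothing when the port is free.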
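2. Data is not written to the database
After starting everything as described above, no errors are reported, but no monitoring data appears in the database.
Error log 1:
Error message: Could not open data sink!..
The Nagios log contains the error "Could not open data sink!...".
The log looks like this:
tail -20 /usr/local/nagios/var/nagios.log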
Nov 21 04:52:02 localhost nagios: ndomod: NDOMOD 2.0.0 (02-28-2014) Copyright (c) 2009 Nagios Core Development Team and Community Contributors
Nov 21 04:52:02 localhost nagios: ndomod: Could not open data sink! I'll keep trying, but some output may get lost...
Nov 21 04:52:02 localhost nagios: ndomod registered for contact data'
Nov 21 04:52:02 localhost nagios: ndomod registered for contact notification data'
Nov 21 04:52:02 localhost nagios: Event broker module '/usr/local/nagios/bin/ndomod-4x.o' initialized successfully.
Nov 21 04:52:02 localhost nagios: Successfully launched command file worker with pid 18576
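The cause is that ndo2db.cfg and ndomod.cfg have the wrong owner and group, so ndo2db cannot read or write them.
Solution:
Fix the owner and group (or change the file permissions directly). The steps are:
cd /usr/local/nagios/etc/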
ls -l
-rw-------. 1 root root 4825 Nov 21 04:51 ndo2db.cfg
-rw-------. 1 root root 5104 Nov 21 04:38 ndomod.cfg
# Set the owner and group, then the permissions
chown nagios:nagcmd ndomod.cfg
chown nagios:nagcmd ndo2db.cfg
chmod 744 ndo2db.cfg
chmod 744 ndomod.cfg
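To confirm the change took effect, a small sketch like the one below prints owner, group and octal mode for a file in one line. It assumes GNU stat (as shipped on CentOS/RHEL); perm_summary is a hypothetical helper name.

```shell
#!/bin/sh
# Print "<owner> <group> <octal mode>" for a file.
# Uses the GNU coreutils form of stat (-c); BSD stat differs.
# perm_summary is a hypothetical helper.
perm_summary() {
    stat -c '%U %G %a' "$1"
}
```

Running perm_summary ndo2db.cfg and perm_summary ndomod.cfg after the chown and chmod commands should show the nagios user and mode 744 for both files.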
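Error log 2:
After running for a while, the system stops writing monitoring data to the database.
Error message: Still unable to connect to data sink; queue send error, retrying...
The Nagios log and the system log show the following errors: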
[root@localhost etc]# tail -50 /usr/local/nagios/var/nagios.log
[1416980768] ndomod: Still unable to connect to data sink. 28787 items lost, 5000 queued items to flush.
[root@localhost etc]# tail /var/log/messages
Nov 25 22:00:02 localhost ndo2db-4x: Message sent to queue.
Nov 25 22:00:02 localhost ndo2db-4x: Warning: queue send error, retrying...
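Solution:
Edit the kernel parameter file with nano /etc/sysctl.conf and raise the following parameters, for example:
kernel.msgmax = 131072000 (maximum size of a single message, in bytes)
kernel.msgmnb = 131072000 (maximum number of bytes in a single message queue; one queue holds many messages)
kernel.msgmni = 65536000 (maximum number of message queues system-wide)
Then run sysctl -p to load the new values from /etc/sysctl.conf.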
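Before editing sysctl.conf it can help to see what the running kernel currently allows. The sketch below reads the live values from /proc/sys/kernel and compares one against a target. current_limit and below are hypothetical helper names, and the target value in the usage note is simply the example figure from this section.

```shell
#!/bin/sh
# Print the running kernel's value for a message-queue parameter
# such as msgmax, msgmnb or msgmni. current_limit is a hypothetical helper.
current_limit() {
    cat "/proc/sys/kernel/$1"
}

# Return 0 if the first number is lower than the second.
# $1 = current value, $2 = wanted value
below() {
    [ "$1" -lt "$2" ]
}
```

For example, below "$(current_limit msgmnb)" 131072000 && echo "msgmnb needs raising" reports whether the per-queue byte limit is still under the example value.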
Error log 3:
Error writing to data sink! Some output may get lost.
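III. Automatic Database Trimming
To keep the database from growing too large, ndo2db periodically trims old rows (the comments next to these parameters in the configuration file explain this in detail). If your project needs to keep monitoring data long term, change the corresponding parameters in the configuration file. I am using the defaults for now.
Edit the ndo2db.cfg configuration file:
nano /usr/local/nagios/etc/ndo2db.cfg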
Several parameters appear under TABLE TRIMMING OPTIONS:
max_timedevents_age=1440
max_systemcommands_age=10080
max_servicechecks_age=10080
max_hostchecks_age=10080
max_eventhandlers_age=44640
max_externalcommands_age=44640
max_notifications_age=44640
max_contactnotifications=44640
max_contactnotificationmethods=44640
max_logentries_age=129600
max_acknowledgements_age=44640
Parameter unit: minutes (for example, 1440 is 24 hours and 10080 is 7 days)
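When choosing longer retention periods it is easier to think in days. The sketch below converts in both directions, assuming the values are in minutes, which is consistent with the sample configuration's 1440 for a 24-hour window. minutes_to_days and days_to_minutes are hypothetical helper names.

```shell
#!/bin/sh
# Convert a trimming value in minutes to whole days (1 day = 1440 minutes).
minutes_to_days() {
    echo $(( $1 / 1440 ))
}

# Convert a retention period in days to the value for ndo2db.cfg.
days_to_minutes() {
    echo $(( $1 * 1440 ))
}
```

With these, minutes_to_days 44640 gives 31, and days_to_minutes 365 gives 525600, the value to use for keeping a table's rows for one year.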