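Monitoring Kafka JMX Metrics with jmxtrans + InfluxDB + Grafana
I have been working on Kafka cluster monitoring recently and read a lot of material online beforehand. I chose jmxtrans + InfluxDB + Grafana because the dashboards look good and can be customized. The drawback is that this stack cannot operate on the Kafka cluster itself, so you may need to use it together with Kafka Manager.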
Environment
CentOS Linux release 7.6.1810 (Core)
jdk1.8.0_201
zookeeper-3.4.14
kafka_2.11-2.2.0
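Enabling the Kafka JMX port
JMX (Java Management Extensions) is a framework for adding management capabilities to applications, devices, and systems. It works across heterogeneous operating systems, system architectures, and network transport protocols, which makes it possible to build seamlessly integrated system, network, and service management applications. As a Java application, Kafka already defines a rich set of performance metrics (see the Kafka monitoring metrics documentation), and JMX makes them easy to monitor.
In ${KAFKA_HOME}/bin/, edit the kafka-run-class.sh script and add the following as its first line:
JMX_PORT=9999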
Restart Kafka
./bin/kafka-server-stop.sh
./bin/kafka-server-start.sh -daemon ./config/server.properties
After the restart, check the status of Kafka and of the JMX port
ps -ef | grep kafka
root 8273 1 99 02:32 pts/0 00:00:09 /opt/jdk1.8.0_201/bin/java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 ...... kafka.Kafka ./config/server.properties
netstat -anop | grep 9999
tcp6 0 0 :::9999 :::* LISTEN 8273/java off (0.00/0/0)
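The same port check can be scripted, which is convenient when several brokers have to be verified. The following is only a minimal sketch, assuming Python 3 is available on the host; the function name `port_is_open` is made up for illustration, and 127.0.0.1 assumes the broker runs locally.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 9999 is the JMX port configured in kafka-run-class.sh above.
    print("JMX port open:", port_is_open("127.0.0.1", 9999))
```

A successful TCP connection only shows that something is listening on the port; it does not prove that the listener speaks JMX.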
Installing InfluxDB
InfluxDB is a time series database built to handle high write and query loads. It is intended as the backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.
Download the InfluxDB RPM package
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.7.5.x86_64.rpm
--2019-04-10 02:52:30-- https://dl.influxdata.com/influxdb/releases/influxdb-1.7.5.x86_64.rpm
Resolving dl.influxdata.com (dl.influxdata.com)... 54.192.151.21, 54.192.151.81, 54.192.151.87, ...
Connecting to dl.influxdata.com (dl.influxdata.com)|54.192.151.21|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 46536692 (44M) [application/octet-stream]
Saving to: ‘influxdb-1.7.5.x86_64.rpm’
100%[======================================>] 46,536,692 440KB/s in 60s
2019-04-10 02:53:37 (756 KB/s) - ‘influxdb-1.7.5.x86_64.rpm’ saved [46536692/46536692]
Install the RPM package
rpm -ivh influxdb-1.7.5.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:influxdb-1.7.5-1 ################################# [100%]
Created symlink from /etc/systemd/system/influxd.service to /usr/lib/systemd/system/influxdb.service.
Created symlink from /etc/systemd/system/multi-user.target.wants/influxdb.service to /usr/lib/systemd/system/influxdb.service.
Start InfluxDB
service influxdb start
Redirecting to /bin/systemctl start influxdb.service
Check the InfluxDB status
ps -ef | grep influxdb
influxdb 8475 1 2 03:01 ? 00:00:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
root 8486 7007 0 03:02 pts/0 00:00:00 grep --color=auto influxdb
service influxdb status
Redirecting to /bin/systemctl status influxdb.service
● influxdb.service - InfluxDB is an open-source, distributed, time series database
Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
Active: active (running) since Wed 2019-04-10 03:01:48 EDT; 22s ago
Docs: https://docs.influxdata.com/influxdb/
Main PID: 8475 (influxd)
CGroup: /system.slice/influxdb.service
└─8475 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375804Z lvl=info msg="Starting precreation service" log_id=0eiwgwrl000 service=shard-precreation check_interval=10m advance_period=30m
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375810Z lvl=info msg="Starting snapshot service" log_id=0eiwgwrl000 service=snapshot
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375816Z lvl=info msg="Starting continuous query service" log_id=0eiwgwrl000 service=continuous_querier
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375826Z lvl=info msg="Starting HTTP service" log_id=0eiwgwrl000 service=httpd authentication=false
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375830Z lvl=info msg="Opened HTTP access log" log_id=0eiwgwrl000 service=httpd path=stderr
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375936Z lvl=info msg="Listening on HTTP" log_id=0eiwgwrl000 service=httpd addr=[::]:8086 https=false
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.375949Z lvl=info msg="Starting retention policy enforcement service" log_id=0eiwgwrl000 service=retention check_interval=30m
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.376138Z lvl=info msg="Listening for signals" log_id=0eiwgwrl000
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.376389Z lvl=info msg="Storing statistics" log_id=0eiwgwrl000 service=monitor db_instance=_internal db_rp=monitor interval=10s
Apr 10 03:01:48 node1 influxd[8475]: ts=2019-04-10T07:01:48.376534Z lvl=info msg="Sending usage statistics to usage.influxdata.com" log_id=0eiwgwrl000
Using the InfluxDB client
influx
Connected to http://localhost:8086 version 1.7.5
InfluxDB shell version: 1.7.5
Enter an InfluxQL query
>
Create a user and a database
> CREATE USER "admin" WITH PASSWORD 'admin' WITH ALL PRIVILEGES
> CREATE DATABASE "jmxdb"
A user and a database are all InfluxDB needs for now. A few other basic operations that will come in handy later:
# Create a database
CREATE DATABASE "db_name"
# List all databases
SHOW DATABASES
# Drop a database
DROP DATABASE "db_name"
# Switch to a database
USE db_name
# List all measurements (tables) in the current database
SHOW MEASUREMENTS
# Create a measurement: simply name it when inserting data
INSERT test,host=127.0.0.1,monitor_name=test count=1
# Drop a measurement
DROP MEASUREMENT "measurement_name"
# Exit
quit
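The INSERT statement takes a point in InfluxDB line protocol: the measurement name, then comma-separated tags, a space, and then the fields. The sketch below builds such a line in Python to make the three parts explicit. The helper names (`to_line`, `_escape`, `_format_field`) are invented for this illustration and are not part of any InfluxDB client library, and timestamps are left out for brevity.

```python
def _escape(text: str, special: str) -> str:
    """Backslash-escape every character of `special` that occurs in `text`."""
    for ch in special:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value) -> str:
    """Render one field value in line-protocol syntax."""
    if isinstance(value, bool):      # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"           # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string field.
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + text + '"'


def to_line(measurement: str, tags: dict, fields: dict) -> str:
    """Build 'measurement,tag=value,... field=value,...' (no timestamp)."""
    if not fields:
        raise ValueError("line protocol needs at least one field")
    head = _escape(measurement, ", ")
    tag_part = "".join(
        "," + _escape(str(k), ",= ") + "=" + _escape(str(v), ",= ")
        for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        _escape(str(k), ",= ") + "=" + _format_field(v) for k, v in fields.items()
    )
    return f"{head}{tag_part} {field_part}"


if __name__ == "__main__":
    # Mirrors the INSERT example above, with count written as a float.
    print(to_line("test", {"host": "127.0.0.1", "monitor_name": "test"}, {"count": 1.0}))
```

Tags are indexed strings and fields hold the actual values, which is why the broker or metric name is a good tag and the counter reading is a field.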
Installing jmxtrans
jmxtrans automatically collects JMX data from a JVM and, following a JSON configuration file, writes it to another application (InfluxDB in this example).
Download the jmxtrans RPM package
wget http://central.maven.org/maven2/org/jmxtrans/jmxtrans/270/jmxtrans-270.rpm
--2019-04-10 03:18:14-- http://central.maven.org/maven2/org/jmxtrans/jmxtrans/270/jmxtrans-270.rpm
Resolving central.maven.org (central.maven.org)... 151.101.40.209
Connecting to central.maven.org (central.maven.org)|151.101.40.209|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18750744 (18M) [application/x-rpm]
Saving to: ‘jmxtrans-270.rpm’
100%[======================================>] 18,750,744 342KB/s in 43s
2019-04-10 03:18:59 (422 KB/s) - ‘jmxtrans-270.rpm’ saved [18750744/18750744]
Install the RPM package
rpm -ivh jmxtrans-270.rpm
Preparing... ################################# [100%]
Updating / installing...
1:jmxtrans-270-1 ################################# [100%]
jmxtrans paths
Installation directory: /usr/share/jmxtrans
Default directory for JSON files: /var/lib/jmxtrans/
Log file: /var/log/jmxtrans/jmxtrans.log
Configure the JSON file. The jmxtrans GitHub repository provides a sample configuration:
{
  "servers" : [ {
    "port" : "1099",
    "host" : "w2",
    "queries" : [ {
      "obj" : "java.lang:type=Memory",
      "attr" : [ "HeapMemoryUsage", "NonHeapMemoryUsage" ],
      "resultAlias" : "jvmmemory",
      "outputWriters" : [ {
        "@class" : "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url" : "http://127.0.0.1:8086/",
        "username" : "admin",
        "password" : "admin",
        "database" : "jmxdb",
        "tags" : {"application" : "kafka"}
      } ]
    } ]
  } ]
}
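host: the server to monitor
port: the JMX port
obj: the JMX ObjectName, that is, the metric we want to monitor
attr: the attributes of that ObjectName, which can be thought of as the values of the metric
resultAlias: the metric name, which becomes the measurement name in InfluxDB
tags: maps to InfluxDB tags, which distinguish different metrics stored in the same measurement. They are used later when building graphs in Grafana, so it is a good idea to tag every metric.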
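Because a single typo in one of these keys can leave a query silently unused, it may help to sanity-check a file before dropping it into /var/lib/jmxtrans/. The sketch below is an assumption-laden illustration: `check_config` and `EXPECTED_QUERY_KEYS` are hypothetical names, and the keys it insists on are simply the ones used by the configurations in this article. jmxtrans itself may well accept files that omit some of them.

```python
import json

# Keys that every query in this article's configs carries (an assumption of
# this sketch, not a rule enforced by jmxtrans).
EXPECTED_QUERY_KEYS = ("obj", "attr", "outputWriters")


def check_config(text: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]

    servers = config.get("servers") if isinstance(config, dict) else None
    if not isinstance(servers, list) or not servers:
        return ['missing or empty "servers" list']

    problems = []
    for s_idx, server in enumerate(servers):
        if not isinstance(server, dict):
            problems.append(f"servers[{s_idx}]: not a JSON object")
            continue
        for key in ("host", "port"):
            if key not in server:
                problems.append(f"servers[{s_idx}]: missing {key!r}")
        for q_idx, query in enumerate(server.get("queries", [])):
            for key in EXPECTED_QUERY_KEYS:
                if key not in query:
                    problems.append(
                        f"servers[{s_idx}].queries[{q_idx}]: missing {key!r}"
                    )
    return problems


if __name__ == "__main__":
    sample = '{"servers": [{"host": "127.0.0.1", "queries": [{"obj": "java.lang:type=Memory"}]}]}'
    for problem in check_config(sample):
        print(problem)
```

The check is purely structural. It cannot tell whether an ObjectName or attribute actually exists on the broker, and those names are case-sensitive.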
Start jmxtrans
service jmxtrans start
Starting jmxtrans...
If the log shows no errors, the startup succeeded
tail /var/log/jmxtrans/jmxtrans.log
INFO | jvm 1 | 2019/04/10 04:44:31 | Using thread pool 'org.quartz.simpl.SimpleThreadPool' - with 10 threads.
INFO | jvm 1 | 2019/04/10 04:44:31 | Using job-store 'org.quartz.simpl.RAMJobStore' - which does not support persistence. and is not clustered.
INFO | jvm 1 | 2019/04/10 04:44:31 |
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler 'ServerScheduler' initialized from an externally opened InputStream.
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.impl.StdSchedulerFactory - Quartz scheduler version: 1.8.6
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.core.QuartzScheduler - JobFactory set to: com.googlecode.jmxtrans.guice.GuiceJobFactory@23822296
2019-04-10 04:44:31 [WrapperSimpleAppMain] level com.googlecode.jmxtrans.JmxTransformer [JmxTransformer.java:177] - Starting Jmxtrans on : /var/lib/jmxtrans
2019-04-10 04:44:31 [WrapperSimpleAppMain] level org.quartz.core.QuartzScheduler [QuartzScheduler.java:519] - Scheduler ServerScheduler_$_node11554885871753 started.
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO c.googlecode.jmxtrans.JmxTransformer - Starting Jmxtrans on : /var/lib/jmxtrans
INFO | jvm 1 | 2019/04/10 04:44:31 | 2019-04-10 04:44:31 [WrapperSimpleAppMain] INFO org.quartz.core.QuartzScheduler - Scheduler ServerScheduler_$_node11554885871753 started.
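A clean log only shows that jmxtrans started. To confirm that points are actually arriving, one option is to ask InfluxDB which measurements exist in jmxdb through its HTTP /query endpoint on port 8086. The following is a rough sketch using only the Python standard library; `build_query_url` and `list_measurements` are hypothetical helper names, and the credentials are the admin/admin pair created earlier.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, database, query, username=None, password=None):
    """Compose a GET URL for the InfluxDB 1.x /query endpoint."""
    params = {"db": database, "q": query}
    if username is not None:
        # InfluxDB 1.x accepts credentials as the u and p query parameters.
        params["u"] = username
        params["p"] = password or ""
    return base_url.rstrip("/") + "/query?" + urlencode(params)


def list_measurements(base_url, database, username=None, password=None):
    """Return the measurement names in `database` (requires a running InfluxDB)."""
    url = build_query_url(base_url, database, "SHOW MEASUREMENTS", username, password)
    with urlopen(url, timeout=5) as response:
        body = json.load(response)
    series = body["results"][0].get("series", [])
    return [row[0] for entry in series for row in entry["values"]]


if __name__ == "__main__":
    print(build_query_url("http://127.0.0.1:8086/", "jmxdb", "SHOW MEASUREMENTS",
                          "admin", "admin"))
```

Calling `list_measurements("http://127.0.0.1:8086/", "jmxdb", "admin", "admin")` a minute or so after jmxtrans has started should return the resultAlias names from the JSON files if everything is wired up correctly.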
Two general-purpose JSON files are included below
base_127.0.0.1.json
{
  "servers": [{
    "port": "9999",
    "host": "127.0.0.1",
    "queries": [{
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "bytesinpersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "bytesinpersec" }
      }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "bytesoutpersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "bytesoutpersec" }
      }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesRejectedPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "bytesrejectedpersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "bytesrejectedpersec" }
      }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec",
      "attr": ["Count", "OneMinuteRate"],
      "resultAlias": "messagesinpersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "messagesinpersec" }
      }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=FetchConsumer",
      "attr": ["Count"],
      "resultAlias": "requestspersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "request": "fetchconsumer" }
      }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=FetchFollower",
      "attr": ["Count"],
      "resultAlias": "requestspersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "request": "fetchfollower" }
      }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=Produce",
      "attr": ["Count"],
      "resultAlias": "requestspersec",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "request": "produce" }
      }]
    }, {
      "obj": "java.lang:type=Memory",
      "attr": ["HeapMemoryUsage", "NonHeapMemoryUsage"],
      "resultAlias": "memoryusage",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "memoryusage" }
      }]
    }, {
      "obj": "java.lang:type=GarbageCollector,name=*",
      "attr": ["CollectionCount", "CollectionTime"],
      "resultAlias": "gc",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "gc" }
      }]
    }, {
      "obj": "java.lang:type=Threading",
      "attr": ["PeakThreadCount", "ThreadCount"],
      "resultAlias": "thread",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "thread" }
      }]
    }, {
      "obj": "kafka.server:type=ReplicaFetcherManager,name=MaxLag,clientId=Replica",
      "attr": ["Value"],
      "resultAlias": "replicafetchermanager",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "maxlag" }
      }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=PartitionCount",
      "attr": ["Value"],
      "resultAlias": "replicamanager",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "partitioncount" }
      }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
      "attr": ["Value"],
      "resultAlias": "replicamanager",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "underreplicatedpartitions" }
      }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=LeaderCount",
      "attr": ["Value"],
      "resultAlias": "replicamanager",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "leadercount" }
      }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchConsumer",
      "attr": ["Count", "Max"],
      "resultAlias": "totaltimems",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "fetchconsumer" }
      }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=FetchFollower",
      "attr": ["Count", "Max"],
      "resultAlias": "totaltimems",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "fetchfollower" }
      }]
    }, {
      "obj": "kafka.network:type=RequestMetrics,name=TotalTimeMs,request=Produce",
      "attr": ["Count", "Max"],
      "resultAlias": "totaltimems",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "produce" }
      }]
    }, {
      "obj": "kafka.server:type=ReplicaManager,name=IsrShrinksPerSec",
      "attr": ["Count"],
      "resultAlias": "replicamanager",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "isrshrinkspersec" }
      }]
    }]
  }]
}
topica_1.json
{
  "servers": [{
    "port": "9999",
    "host": "127.0.0.1",
    "queries": [{
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=topica",
      "attr": ["Count"],
      "resultAlias": "topica",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "bytesinpersec" }
      }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic=topica",
      "attr": ["Count"],
      "resultAlias": "topica",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "bytesoutpersec" }
      }]
    }, {
      "obj": "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic=topica",
      "attr": ["Count"],
      "resultAlias": "topica",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "messagesinpersec" }
      }]
    }, {
      "obj": "kafka.log:type=Log,name=LogEndOffset,topic=topica,partition=*",
      "attr": ["Value"],
      "resultAlias": "topica",
      "outputWriters": [{
        "@class": "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory",
        "url": "http://127.0.0.1:8086/", "username": "admin", "password": "admin", "database": "jmxdb",
        "tags": { "application": "logendoffset" }
      }]
    }]
  }]
}
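The per-topic file repeats the same writer block for every query and differs from topic to topic only in the topic name, so it lends itself to being generated. The sketch below produces a configuration with the same four queries for an arbitrary topic. The function names (`topic_config`, `_writer`) and the choice to reuse the lowercased metric name as the tag value are assumptions of this example, not something jmxtrans prescribes.

```python
import json

WRITER_CLASS = "com.googlecode.jmxtrans.model.output.InfluxDbWriterFactory"

# (ObjectName template, attributes, tag value) for each per-topic metric.
TOPIC_METRICS = [
    ("kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic={topic}", ["Count"], "bytesinpersec"),
    ("kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec,topic={topic}", ["Count"], "bytesoutpersec"),
    ("kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec,topic={topic}", ["Count"], "messagesinpersec"),
    ("kafka.log:type=Log,name=LogEndOffset,topic={topic},partition=*", ["Value"], "logendoffset"),
]


def _writer(tag_value, url, username, password, database):
    """Build one InfluxDB output writer entry."""
    return {
        "@class": WRITER_CLASS,
        "url": url,
        "username": username,
        "password": password,
        "database": database,
        "tags": {"application": tag_value},
    }


def topic_config(topic, host="127.0.0.1", port="9999",
                 url="http://127.0.0.1:8086/", username="admin",
                 password="admin", database="jmxdb"):
    """Return a jmxtrans config dict covering the four per-topic metrics."""
    queries = [
        {
            "obj": template.format(topic=topic),
            "attr": list(attrs),
            "resultAlias": topic,
            "outputWriters": [_writer(tag, url, username, password, database)],
        }
        for template, attrs, tag in TOPIC_METRICS
    ]
    return {"servers": [{"port": str(port), "host": host, "queries": queries}]}


if __name__ == "__main__":
    # Prints JSON that could be saved as, for example, /var/lib/jmxtrans/topica_1.json
    print(json.dumps(topic_config("topica"), indent=2))
```

Generating the files keeps the ObjectName spelling in one place, which matters because a mistyped or wrongly cased name simply matches nothing.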
Installing Grafana
Grafana is a cross-platform, open-source tool for metrics analysis and visualization. It queries the collected data, presents it visually, and can send timely alerts.
Download the Grafana RPM package
wget https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-6.0.2-1.x86_64.rpm
--2019-04-10 04:53:15-- https://s3-us-west-2.amazonaws.com/grafana-releases/release/grafana-6.0.2-1.x86_64.rpm
Resolving s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)... 52.218.144.92
Connecting to s3-us-west-2.amazonaws.com (s3-us-west-2.amazonaws.com)|52.218.144.92|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 56002012 (53M) [application/x-redhat-package-manager]
Saving to: ‘grafana-6.0.2-1.x86_64.rpm’
100%[======================================>] 56,002,012 177KB/s in 2m 52s
2019-04-10 04:56:08 (318 KB/s) - ‘grafana-6.0.2-1.x86_64.rpm’ saved [56002012/56002012]
Install the RPM package
rpm -ivh grafana-6.0.2-1.x86_64.rpm
warning: grafana-6.0.2-1.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 24098cb6: NOKEY
error: Failed dependencies:
fontconfig is needed by grafana-6.0.2-1.x86_64
urw-fonts is needed by grafana-6.0.2-1.x86_64
Dependencies are missing, so download and install them, then install Grafana again
yum install --downloadonly --downloaddir=./ fontconfig
yum localinstall fontconfig-2.13.0-4.3.el7.x86_64.rpm
yum install --downloadonly --downloaddir=./ urw-fonts
yum localinstall urw-fonts-2.4-16.el7.noarch.rpm
rpm -ivh grafana-6.0.2-1.x86_64.rpm
warning: grafana-6.0.2-1.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 24098cb6: NOKEY
Preparing... ################################# [100%]
Updating / installing...
1:grafana-6.0.2-1 ################################# [100%]
### NOT starting on installation, please execute the following statements to configure grafana to start automatically using systemd
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable grafana-server.service
### You can start grafana-server by executing
sudo /bin/systemctl start grafana-server.service
POSTTRANS: Running script
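Start Grafana
service grafana-server start
Starting grafana-server (via systemctl): [ OK ]
Open a browser
http://127.0.0.1:3000
Log in with the default username and password, admin/admin
Set a new password
Click Add data source
Select InfluxDB
Enter the connection details and click Save & Test
Once the test passes, click Back to return
Use the + in the left sidebar to create or import a dashboard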
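Queries for a metric are written much like SQL statements against a database.
To chart an average per-second value, subtract the value from one minute earlier from the current value and divide the result by 60.
How the finished dashboard looks is up to your own sense of design, so I will not post mine here. That completes JMX metrics monitoring for Kafka.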
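The arithmetic behind that per-second figure can be sketched outside Grafana. Attributes such as Count are cumulative counters, so the rate is the difference between two samples divided by the seconds between them. With samples 60 seconds apart this is exactly "current value minus the value one minute ago, divided by 60". The function name `per_second_rates` is hypothetical, and skipping negative differences (for example after a broker restart resets the counter) is an addition of this sketch.

```python
def per_second_rates(samples):
    """Turn cumulative counter samples into per-second rates.

    `samples` is an iterable of (unix_seconds, counter_value) pairs sorted by
    time. Returns a list of (unix_seconds, rate) pairs, one per usable interval.
    """
    rates = []
    previous = None
    for timestamp, value in samples:
        if previous is not None:
            prev_timestamp, prev_value = previous
            elapsed = timestamp - prev_timestamp
            delta = value - prev_value
            # Skip intervals with no elapsed time or a counter reset.
            if elapsed > 0 and delta >= 0:
                rates.append((timestamp, delta / elapsed))
        previous = (timestamp, value)
    return rates


if __name__ == "__main__":
    # Counter grows by 600 per minute, i.e. 10 per second.
    print(per_second_rates([(0, 100), (60, 700), (120, 1300)]))
```

Dividing by the measured interval rather than a fixed 60 keeps the result correct when a sample arrives late or is missing.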