arthas
程序员文章站
2022-07-15 15:38:43
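I came across arthas, described as a JVM tuning tool, in the 2019 list of the most popular Chinese open-source software,
so I installed it to try it out. The official Arthas user documentation is also worth consulting.
Before using it, first take a look at the Linux strace command (see "16 Linux server monitoring commands you need to know").
Running strace -ff -o ./bdo -p 23809 traces the system calls made by every thread of process 23809. With -ff and -o together, strace writes each thread's trace to its own file named ./bdo.<id>. arthas presumably relies on some of these basic Linux calls as well.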
recvfrom(402, "\0\0\0\2\0\0\0", 7, 0, NULL, NULL) = 7
ioctl(402, FIONREAD, [0]) = 0
ioctl(402, FIONREAD, [0]) = 0
sendto(402, "\36\0\0\0\3select @@session.tx_read_on"..., 34, 0, NULL, 0) = 34
ioctl(402, FIONREAD, [0]) = 0
recvfrom(402, "\1\0\0\1", 4, 0, NULL, NULL) = 4
ioctl(402, FIONREAD, [73]) = 0
recvfrom(402, "\1,\0\0\2\3def\0\0\0\[email protected]@session.tx_read_o"..., 73, 0, NULL, NULL) = 73
ioctl(402, FIONREAD, [0]) = 0
ioctl(402, FIONREAD, [0]) = 0
sendto(402, "#\0\0\0\3set session transaction rea"..., 39, 0, NULL, 0) = 39
ioctl(402, FIONREAD, [0]) = 0
recvfrom(402, "\7\0\0\1", 4, 0, NULL, NULL) = 4
ioctl(402, FIONREAD, [7]) = 0
recvfrom(402, "\0\0\0\2\0\0\0", 7, 0, NULL, NULL) = 7
ioctl(402, FIONREAD, [0]) = 0
write(330, "\1", 1) = 1
futex(0x7f97e95d7a24, FUTEX_WAIT_BITSET_PRIVATE, 59475, {57557167, 192371664}, ffffffff) = 0
futex(0x7f97e95d79f8, FUTEX_WAIT_PRIVATE, 2, NULL) = -1 EAGAIN (Resource temporarily unavailable)
futex(0x7f97e95d79f8, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "2021-03-16 12:27:14.395 INFO c."..., 93) = 93
write(308, "12:27:14.395 INFO c.b.f.o.s.han"..., 76) = 76
write(1, "2021-03-16 12:27:14.396 INFO c."..., 101) = 101
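A trace like the one above gets long quickly, so a per-syscall tally is often the first thing worth looking at. The following is a minimal sketch in Java, assuming each line of a ./bdo.<id> file starts with the syscall name followed by an opening parenthesis, as in the excerpt. The class name StraceTally and its method are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of strace or arthas.
final class StraceTally {
    // A syscall line starts with the syscall name and "(", e.g. "ioctl(402, ...".
    private static final Pattern SYSCALL = Pattern.compile("^([a-z_][a-z0-9_]*)\\(");

    /** Counts how often each syscall name appears; lines that do not match are skipped. */
    static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = SYSCALL.matcher(line.trim());
            if (m.find()) {
                counts.merge(m.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // A few lines copied from the trace above.
        List<String> sample = Arrays.asList(
            "recvfrom(402, \"\\0\\0\\0\\2\\0\\0\\0\", 7, 0, NULL, NULL) = 7",
            "ioctl(402, FIONREAD, [0]) = 0",
            "ioctl(402, FIONREAD, [0]) = 0",
            "write(330, \"\\1\", 1) = 1");
        System.out.println(tally(sample));
    }
}
```

To run it on a real trace, read the file with Files.readAllLines and pass the result to tally.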
Below is the basic usage of arthas.
[[email protected] app]# curl -O https://alibaba.github.io/arthas/arthas-boot.jar
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 135k 100 135k 0 0 89227 0 0:00:01 0:00:01 --:--:-- 89269
[[email protected] app]# ll
total 140
-rw-r--r-- 1 root root 138993 Mar 16 11:23 arthas-boot.jar
drwxr-xr-x 7 10 143 4096 Jul 7 2018 jdk1.8.0_181
[[email protected] app]# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.4.5
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 5280 org.apache.ambari.server.controller.AmbariServer
[2]: 992 org.tanukisoftware.wrapper.WrapperSimpleApp
[3]: 9477 org.apache.hadoop.hdfs.server.namenode.NameNode
[4]: 25895 org.apache.livy.server.LivyServer
[5]: 28007 org.apache.spark.deploy.history.HistoryServer
[6]: 19755 org.apache.hadoop.yarn.server.nodemanager.NodeManager
[7]: 24046 org.apache.hadoop.util.RunJar
[8]: 12848 org.apache.hadoop.hbase.master.HMaster
[9]: 14704 org.apache.phoenix.queryserver.server.QueryServer
[10]: 26421 org.apache.hadoop.util.RunJar
[11]: 8822
[12]: 16505 org.apache.hadoop.hbase.thrift2.ThriftServer
[13]: 14303 org.apache.hadoop.hbase.regionserver.HRegionServer
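########################## Entering 1 selects the Ambari server (PID 5280), and the attach fails as shown below. AmbariServer is listed with a Java main class, so the failure is unlikely to come from Ambari's Python tooling; this error is commonly reported when the user running arthas-boot differs from the user that owns the target JVM.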
1
[INFO] Start download arthas from remote server: https://arthas.aliyun.com/download/3.5.0?mirror=aliyun
[INFO] Download arthas success.
[INFO] arthas home: /root/.arthas/lib/3.5.0/arthas
[INFO] Try to attach process 5280
[ERROR] Start arthas failed, exception stack trace:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:99)
at com.taobao.arthas.core.Arthas.<init>(Arthas.java:26)
at com.taobao.arthas.core.Arthas.main(Arthas.java:137)
[ERROR] attach fail, targetPid: 5280
When the attach socket file cannot be opened, the following exception is reported (here for PID 28007):
[ERROR] Start arthas failed, exception stack trace:
com.sun.tools.attach.AttachNotSupportedException: Unable to open socket file: target process not responding or HotSpot VM not loaded
at sun.tools.attach.LinuxVirtualMachine.<init>(LinuxVirtualMachine.java:106)
at sun.tools.attach.LinuxAttachProvider.attachVirtualMachine(LinuxAttachProvider.java:78)
at com.sun.tools.attach.VirtualMachine.attach(VirtualMachine.java:250)
at com.taobao.arthas.core.Arthas.attachAgent(Arthas.java:99)
at com.taobao.arthas.core.Arthas.<init>(Arthas.java:26)
at com.taobao.arthas.core.Arthas.main(Arthas.java:137)
[ERROR] attach fail, targetPid: 28007
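One commonly reported cause of this AttachNotSupportedException on Linux is that the attaching JVM runs as a different user than the target JVM. The sketch below is only a quick check for that one cause, under the assumption that the owner of /proc/<pid> reflects the user the process runs as. The class AttachOwnerCheck is hypothetical and has nothing to do with the arthas code base.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; Linux only, because it relies on /proc.
final class AttachOwnerCheck {
    /** Returns the owner of /proc/<pid> as reported by the filesystem. */
    static String ownerOf(String pid) {
        Path procDir = Paths.get("/proc", pid);
        try {
            return Files.getOwner(procDir).getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when the target process appears to be owned by the user running this JVM. */
    static boolean sameUser(String pid) {
        return ownerOf(pid).equals(System.getProperty("user.name"));
    }

    public static void main(String[] args) {
        // Pass a PID such as 5280; "self" resolves to this JVM's own entry.
        String pid = args.length > 0 ? args[0] : "self";
        System.out.println("owner of " + pid + ": " + ownerOf(pid));
        System.out.println("same user as this JVM: " + sameUser(pid));
    }
}
```

If the owners differ, rerunning arthas-boot as the target's user (for example through sudo -u or su) is the usual thing to try. Other causes, such as attach being disabled in the target JVM, are not covered by this check.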
Attaching arthas to the bdo.jar process succeeds and drops into the arthas interactive shell.
[root@bwsc52 application]# java -jar arthas-boot.jar
[INFO] arthas-boot version: 3.4.5
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 23809 bdo.jar
1
[INFO] Start download arthas from remote server: https://arthas.aliyun.com/download/3.5.0?mirror=aliyun
[INFO] File size: 12.22 MB, downloaded size: 2.15 MB, downloading ...
[INFO] File size: 12.22 MB, downloaded size: 6.23 MB, downloading ...
[INFO] File size: 12.22 MB, downloaded size: 8.70 MB, downloading ...
[INFO] File size: 12.22 MB, downloaded size: 11.60 MB, downloading ...
[INFO] Download arthas success.
[INFO] Download arthas success.
[INFO] arthas home: /root/.arthas/lib/3.5.0/arthas
[INFO] Try to attach process 23809
[INFO] Attach process 23809 success.
[INFO] arthas-client connect 127.0.0.1 3658
,---. ,------. ,--------.,--. ,--. ,---. ,---.
/ O \ | .--. ''--. .--'| '--' | / O \ ' .-'
| .-. || '--'.' | | | .--. || .-. |`. `-.
| | | || |\ \ | | | | | || | | |.-' |
`--' `--'`--' '--' `--' `--' `--'`--' `--'`-----'
wiki https://arthas.aliyun.com/doc
tutorials https://arthas.aliyun.com/doc/arthas-tutorials.html
version 3.5.0
main_class
pid 23809
time 2021-03-16 11:35:42
[arthas@23809]$
Use thread to view CPU usage per thread; the output below looks normal.
[arthas@23809]$ thread
Threads Total: 870, NEW: 0, RUNNABLE: 58, BLOCKED: 0, WAITING: 755, TIMED_WAITING: 47, TERMINATED: 0, Internal threads: 10
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTED DAEMON
23826 arthas-command-execute system 5 RUNNABLE 24.08 0.239 0:0.488 false true
-1 C2 CompilerThread0 - -1 - 0.9 0.009 7:23.236 false true
63 redisson-netty-4-5 main 5 RUNNABLE 0.74 0.007 528:19.124 false false
-1 C1 CompilerThread2 - -1 - 0.46 0.004 4:24.470 false true
23 Thread-7 system 9 WAITING 0.45 0.004 3:15.760 false true
24 SimplePauseDetectorThread_0 system 9 TIMED_WAITIN 0.32 0.003 92:31.434 false true
58 pool-12-thread-1 main 5 TIMED_WAITIN 0.17 0.001 96:36.543 false false
59 redisson-netty-4-1 main 5 RUNNABLE 0.15 0.001 36:54.021 false false
60 redisson-netty-4-2 main 5 RUNNABLE 0.13 0.001 49:41.481 false false
25 elasticsearch[_client_][[timer]] main 5 TIMED_WAITIN 0.11 0.001 64:34.247 false true
64 redisson-netty-4-6 main 5 RUNNABLE 0.1 0.001 40:56.540 false false
40 elasticsearch[_client_][[timer]] main 5 TIMED_WAITIN 0.09 0.000 64:56.965 false true
35 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.07 0.000 68:55.662 false true
61 redisson-netty-4-3 main 5 RUNNABLE 0.07 0.000 40:9.429 false false
771 Abandoned connection cleanup thread main 5 TIMED_WAITIN 0.06 0.000 132:16.994 false true
867 SystemTimer main 1 TIMED_WAITIN 0.05 0.000 640:51.449 false true
863 pool-37-thread-1{Hashed wheel timer # main 5 TIMED_WAITIN 0.04 0.000 93:9.247 false false
824 SimplePauseDetectorThread_2 main 5 TIMED_WAITIN 0.04 0.000 465:31.486 false true
32 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.04 0.000 85:44.849 false true
-1 VM Periodic Task Thread - -1 - 0.03 0.000 143:42.067 false true
822 SimplePauseDetectorThread_0 main 5 TIMED_WAITIN 0.03 0.000 459:42.045 false true
65 redisson-netty-4-7 main 5 RUNNABLE 0.02 0.000 40:4.071 false false
36 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.02 0.000 69:52.316 false true
823 SimplePauseDetectorThread_1 main 5 TIMED_WAITIN 0.02 0.000 456:44.616 false true
62 redisson-netty-4-4 main 5 RUNNABLE 0.02 0.000 38:57.459 false false
42 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.02 0.000 108:58.594 false true
30 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.02 0.000 95:29.021 false true
28 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.01 0.000 76:42.378 false true
50 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.01 0.000 64:35.136 false true
34 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.01 0.000 73:10.340 false tr
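The first line of the thread output is a count of threads per state. For comparison, here is a minimal sketch of how a similar summary can be obtained through the standard ThreadMXBean. It only sees the JVM it runs in, and it does not report the JVM-internal threads that arthas lists with ID -1. The class name ThreadStateSummary is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper; summarises the threads of the JVM it runs in.
final class ThreadStateSummary {
    /** Counts live Java threads per Thread.State. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info != null) { // a thread may have exited between the two calls
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Thread.State, Integer> counts = countByState();
        int total = 0;
        for (int n : counts.values()) {
            total += n;
        }
        System.out.println("Threads Total: " + total + " " + counts);
    }
}
```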
Use thread -b to check for a thread that is blocking other threads, for example in a deadlock; none is found here.
[arthas@23809]$ thread -b
No most blocking thread found!
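A related check is available through plain JMX. Note that this is deadlock detection, which is not the same thing as thread -b: the arthas command looks for the thread that blocks the most other threads, while the sketch below only reports threads caught in a lock cycle. The class name DeadlockCheck is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; inspects the JVM it runs in.
final class DeadlockCheck {
    /** Returns the IDs of deadlocked threads, or an empty array when there are none. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks, where supported.
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()
                : threads.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids; // both methods return null when nothing is found
    }

    public static void main(String[] args) {
        long[] ids = deadlockedThreadIds();
        System.out.println(ids.length == 0
                ? "No deadlocked threads found"
                : ids.length + " deadlocked thread(s)");
    }
}
```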
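If a memory leak is suspected, run the dashboard command to take a look. In the output below no thread is in the BLOCKED state and heap usage is 45.65%, so there is nothing to worry about.
[arthas@23809]$ dashboard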
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTED DAEMON
867 SystemTimer main 1 TIMED_WAITIN 0.0 0.000 640:54.165 false true
-1 VM Thread - -1 - 0.0 0.000 588:37.784 false true
63 redisson-netty-4-5 main 5 RUNNABLE 0.0 0.000 528:25.859 false false
824 SimplePauseDetectorThread_2 main 5 TIMED_WAITIN 0.0 0.000 465:33.064 false true
822 SimplePauseDetectorThread_0 main 5 TIMED_WAITIN 0.0 0.000 459:43.597 false true
823 SimplePauseDetectorThread_1 main 5 TIMED_WAITIN 0.0 0.000 456:46.142 false true
826 lettuce-nioEventLoop-6-2 main 5 RUNNABLE 0.0 0.000 392:0.025 false true
-1 VM Periodic Task Thread - -1 - 0.0 0.000 143:43.694 false true
771 Abandoned connection cleanup thread main 5 TIMED_WAITIN 0.0 0.000 132:18.651 false true
42 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 109:0.027 false true
46 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 102:12.483 false true
58 pool-12-thread-1 main 5 TIMED_WAITIN 0.0 0.000 96:37.703 false false
44 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 96:15.539 false true
30 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 95:29.943 false true
863 pool-37-thread-1{Hashed wheel timer # main 5 TIMED_WAITIN 0.0 0.000 93:10.358 false false
836 pool-35-thread-1{Hashed wheel timer # main 5 TIMED_WAITIN 0.0 0.000 93:1.143 false false
24 SimplePauseDetectorThread_0 system 9 TIMED_WAITIN 0.0 0.000 92:32.502 false true
47 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 90:23.412 false true
31 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 89:10.260 false true
33 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 86:56.440 false true
32 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 85:45.674 false true
45 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 81:21.415 false true
28 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 76:43.069 false true
34 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 73:11.123 false true
48 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 70:32.459 false true
36 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 69:52.974 false true
35 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 68:56.311 false true
40 elasticsearch[_client_][[timer]] main 5 TIMED_WAITIN 0.0 0.000 64:57.801 false true
50 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 64:35.837 false true
25 elasticsearch[_client_][[timer]] main 5 TIMED_WAITIN 0.0 0.000 64:34.984 false true
49 elasticsearch[_client_][transport_cli main 5 RUNNABLE 0.0 0.000 64:11.541 false true
835 Avro NettyTransceiver I/O Worker 8{Ne main 5 RUNNABLE 0.0 0.000 58:36.468 false true
834 Avro NettyTransceiver I/O Worker 7{Ne main 5 RUNNABLE 0.0 0.000 50:9.839 false true
60 redisson-netty-4-2 main 5 RUNNABLE 0.0 0.000 49:42.133 false false
828 Avro NettyTransceiver I/O Worker 1{Ne main 5 RUNNABLE 0.0 0.000 48:18.746 false true
Memory used total max usage GC
heap 384M 783M 843M 45.65% gc.ps_scavenge.count 21462
ps_eden_space 77M 294M 300M 25.67% gc.ps_scavenge.time(ms) 2003799
ps_survivor_space 4M 4M 4M 91.06% gc.ps_marksweep.count 10
ps_old_gen 303M 485M 632M 48.03% gc.ps_marksweep.time(ms) 6320
nonheap 190M 242M -1 78.50%
code_cache 35M 77M 240M 14.61%
metaspace 138M 146M -1 94.52%
compressed_class_space 17M 18M 1024M 1.67%
direct 131M 131M - 100.00%
mapped 0K 0K - 0.00%
Runtime
os.name Linux
os.version 3.10.0-327.el7.x86_64
java.version 1.8.0_151
java.home /usr/java/jdk1.8.0_151/jre
systemload.average 0.28
processors 4
timestamp/uptime Tue Mar 16 11:46:58 CST 2021/9650054s
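The usage column in the Memory table is worth a second look. The figures are consistent with used / max where a max is defined (ps_eden_space: 77M / 300M = 25.67%) and with used / total where max is -1 (nonheap: 190M / 242M is about 78.5%). The sketch below encodes that reading as an assumption drawn from the numbers above, not from the arthas source, and prints the same kind of table from the standard MemoryPoolMXBean. The class name MemoryUsageReport is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper; reports on the JVM it runs in.
final class MemoryUsageReport {
    /** used / max when max is defined, otherwise used / committed; max == -1 means undefined. */
    static double usagePercent(long used, long committed, long max) {
        long denominator = max > 0 ? max : committed;
        return denominator > 0 ? 100.0 * used / denominator : 0.0;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            if (u == null) {
                continue; // the pool is no longer valid
            }
            // Sizes are printed in MiB; max stays -1 when the pool has no defined limit.
            System.out.printf("%-26s used=%d total=%d max=%d usage=%.2f%%%n",
                    pool.getName(),
                    u.getUsed() >> 20,
                    u.getCommitted() >> 20,
                    u.getMax() < 0 ? -1 : u.getMax() >> 20,
                    usagePercent(u.getUsed(), u.getCommitted(), u.getMax()));
        }
    }
}
```

On this reading, metaspace at 94.52% is measured against its committed size rather than against a hard limit, so on its own it does not indicate that metaspace is about to run out.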
Dump a heap snapshot
[arthas@23809]$ heapdump --live /root/jvm.hprof
Dumping heap to /root/jvm.hprof ...
Heap dump file created
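For reference, a HotSpot JVM can also write the same kind of .hprof file from inside the process through the HotSpotDiagnosticMXBean. This is a minimal sketch of that standard JMX route, not a claim about how arthas implements heapdump; the class name HeapDumpSketch and the temp-file location are made up for illustration. Passing true as the second argument dumps only live (reachable) objects, which corresponds to the --live flag used above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; HotSpot-specific, dumps the heap of the JVM it runs in.
final class HeapDumpSketch {
    /** Writes a live-objects heap dump to a fresh temp file and returns its path. */
    static Path dumpToTempFile() {
        try {
            // dumpHeap refuses to overwrite, so use a new name inside a new temp directory.
            Path file = Files.createTempDirectory("heapdump-sketch").resolve("jvm.hprof");
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            bean.dumpHeap(file.toString(), true); // true = live objects only
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpToTempFile();
        System.out.println("Heap dump written to " + file + " (" + file.toFile().length() + " bytes)");
    }
}
```

The resulting file can be opened with a heap analyzer such as Eclipse MAT or VisualVM. A dump of a large heap pauses the application while it is written, so it is best taken with care on a production process.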