Java Performance Analysis: A Tool for Diagnosing CPU Spikes
程序员文章站
2022-04-18 13:19:10
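Background
Anyone who has handled production incidents has probably seen a system suddenly slow down, CPU usage spike, or the whole application stop serving requests. When this happens, and as long as data correctness is not put at risk, we should export the jstack output and memory information as quickly as possible, then restart the system to restore availability and limit the impact on users. This article focuses on CPU spikes and describes a troubleshooting approach that quickly narrows the problem down to a specific thread, or even a specific block of code, which in turn suggests how to fix it.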
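The "export first, then restart" step can be sketched as a small helper. This is only an illustration: the function name `capture_diagnostics` and the output file names are hypothetical, and using `jmap -dump` for the "memory information" is an assumption, since the article does not name a tool for it. It also assumes `jstack` and `jmap` are on the PATH and are run as the same user as the JVM.

```bash
#!/bin/bash
# Hypothetical helper: save a thread dump and a heap dump for one JVM
# before it is restarted. Assumes the JDK tools jstack and jmap are on PATH.
capture_diagnostics() {
    local pid="$1"
    local out_dir="${2:-.}"
    local stamp
    stamp="$(date +%Y%m%d-%H%M%S)"

    mkdir -p "$out_dir" || return 1

    # Thread dump first: it is cheap and is the input for the CPU analysis.
    jstack "$pid" > "${out_dir}/jstack-${pid}-${stamp}.txt" || return 1

    # Heap dump second: it pauses the JVM while it is written and can be
    # as large as the heap, so make sure out_dir has enough free space.
    jmap -dump:format=b,file="${out_dir}/heap-${pid}-${stamp}.hprof" "$pid" || return 1

    echo "saved diagnostics for pid ${pid} in ${out_dir}"
}
```

A call such as `capture_diagnostics 12345 /tmp/incident` (both values are placeholders) would leave one `.txt` and one `.hprof` file in the target directory for later analysis.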
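Troubleshooting steps
- Use the top command to find the pid of the Java process whose CPU usage has spiked.
- Run ps -mp [pid] -o THREAD,tid,time to list the threads of that process together with the CPU usage of each one, and note the id of every thread whose CPU usage is too high.
- Convert the thread id to hexadecimal and call the result tid_hex.
- Use jstack, the thread-dump command that ships with the JDK.
- Run jstack [pid] | grep -A100 tid_hex to print the stack of that thread (the matching line plus the 100 lines after it).
- Analyze the code based on the stack trace.
These steps locate the code that is driving the CPU spike; that code can then be examined in a code review.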
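The conversion and lookup steps above can be sketched as two small shell functions. The names `tid_to_nid` and `extract_thread_stack` are made up for this example. It also assumes that the JDK prints each thread's native id in the dump as `nid=0x<hex>`, which is what searching for tid_hex relies on. Matching the whole `nid=` field avoids false hits on other hex values in the dump, such as stack addresses.

```bash
#!/bin/bash
# Convert a decimal thread id (as printed by ps) into the nid=0x... form
# that is assumed to appear in the jstack thread header.
tid_to_nid() {
    printf 'nid=0x%x' "$1"
}

# Read a jstack dump on stdin and print only the thread entries whose
# header contains the given nid field. RS="" puts awk in paragraph mode,
# so each blank-line-separated thread entry is one record.
extract_thread_stack() {
    awk -v nid="$1" 'BEGIN { RS = ""; ORS = "\n\n" } index($0, nid " ") > 0'
}
```

For example, `tid_to_nid 6699` prints `nid=0x1a2b`, and `jstack "$pid" | extract_thread_stack "$(tid_to_nid 6699)"` prints the complete entry for that thread, not a fixed number of lines after the match.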
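Packaging the steps into a tool
- The steps above have been wrapped in a script. Given only the process id (pid), it prints the stacks of the threads that use the most CPU, the top 5 by default.
- Uploaded to GitHub: https://github.com/cjunn/script_tool/
./java-thread-top.sh -p <pid>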
#!/bin/bash
# @Function
# Find out the highest cpu consumed threads of java processes, and print the stack of these threads.
# @github https://github.com/cjunn/script_tool/
# @author cjunn
# @date Sun Jan 12 2020 21:08:58 GMT+0800
#

pid=''
count=5

function usage(){
    local prog
    prog="$(basename "$0")"
    cat <<EOF
Usage: ${prog} [OPTION]
Find out the highest cpu consumed threads of java processes,
and print the stack of these threads.
Example:
  ${prog} -p <pid> -c 5      # show top 5 busy java threads info
Output control:
  -p, --pid <java pid>      find out the highest cpu consumed threads from
                            the specified java process (required).
  -c, --count <num>         set the thread count to show, default is 5.
Miscellaneous:
  -h, --help                display this help and exit.
EOF
}

# 1. Collect script parameters.
while [ $# -gt 0 ]; do
    case "$1" in
    -c|--count)
        count="$2"
        # "shift 2" fails without shifting when the value is missing,
        # so fall back to a single shift to avoid looping forever.
        shift 2 || shift
        ;;
    -p|--pid)
        pid="$2"
        shift 2 || shift
        ;;
    -h|--help)
        usage
        exit 0
        ;;
    --)
        shift
        break
        ;;
    *)
        shift
        ;;
    esac
done

# 2. Check that a pid was given.
if [ -z "$pid" ]; then
    echo "error: -p is empty" >&2
    exit 1
fi

function worker(){
    # 1. Query all threads of the process with ps.
    # 2. Delete the header line and the process summary line.
    # 3. Sort by the second column (%CPU), highest first.
    # 4. Keep only the first $count lines.
    # 5. Read CPU utilization, thread id and CPU time into cpu, tid, time.
    # 6. Convert tid to hexadecimal.
    # 7. Dump all threads of the process with jstack.
    # 8. Use awk to select the thread entry whose nid field equals tid_hex,
    #    so that other hex values in the dump (e.g. addresses) do not match.
    # 9. Print the stacks of the $count busiest threads.
    local whilec=0
    local cpu tid time tid_hex
    # The loop reads from a process substitution instead of a pipe: with
    # "ps ... | while read", the loop runs in a subshell and whilec would
    # still be 0 afterwards, so the error below would always be printed.
    while read -r cpu tid time; do
        tid_hex=$(printf "%x" "$tid")
        echo "====================== tid:${tid}  tid_hex:${tid_hex}  cpu:${cpu}  time:${time} ======================"
        jstack "$pid" | awk -v nid="nid=0x${tid_hex} " 'BEGIN {RS = "\n\n+"; ORS = "\n\n"} index($0, nid)'
        echo ""
        whilec=$((whilec + 1))
    done < <(ps -mp "$pid" -o THREAD,tid,time | sed '1,2d' | sort -k 2 -n -r | head -n "$count" | awk '{print $2,$8,$9}')
    if [ "$whilec" -eq 0 ]; then
        echo "error : thread not found, make sure pid exists."
    fi
}
worker
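To see what the ps pipeline inside worker() selects, the ranking stage can be isolated into a function that reads ps output from stdin. This is a sketch under assumptions: `top_threads` is a hypothetical name, and the column positions (%CPU in column 2, TID in column 8, TIME in column 9) are those of procps `ps -mp <pid> -o THREAD,tid,time` on Linux, which the script's `awk '{print $2,$8,$9}'` also relies on.

```bash
#!/bin/bash
# Read the output of "ps -mp <pid> -o THREAD,tid,time" on stdin and print
# "cpu tid time" for the N busiest threads (N defaults to 5).
top_threads() {
    local count="${1:-5}"
    # 1,2d drops the header line and the per-process summary line;
    # LC_ALL=C keeps "." as the decimal point for the numeric sort.
    sed '1,2d' \
        | LC_ALL=C sort -k2,2 -n -r \
        | head -n "$count" \
        | awk '{print $2, $8, $9}'
}
```

With live data, `ps -mp "$pid" -o THREAD,tid,time | top_threads 5` should print one line per thread, busiest first. Checking these lines before calling jstack is a quick way to confirm that the columns are where the script expects them on a given system.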