
Java Performance Analysis - A Tool for Diagnosing CPU Spikes

程序员文章站 (Programmer Article Station), 2022-04-18 13:19:10

Background

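        Anyone who has handled production incidents has probably seen a system suddenly slow down, CPU usage spike, or the whole application stop serving requests. When this happens, and provided data correctness is not put at risk, the priority is to export the jstack output and memory information as quickly as possible, then restart the system to restore availability and limit the impact on users. This article focuses on CPU spikes and lays out a troubleshooting approach that quickly narrows the problem down to a specific thread, or even a specific block of code, so that it can then be fixed.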

Troubleshooting Steps

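  1. Use the top command to find the pid of the Java process whose CPU usage has spiked.
  2. Run ps -mp [pid] -o THREAD,tid,time to list the threads of that process together with each thread's CPU usage, and note the id of any thread whose CPU usage is too high.
  3. Convert that thread id to hexadecimal and call the result tid_hex.
  4. Use jstack, the monitoring command that ships with the JDK.
  5. Run jstack [pid] | grep tid_hex -A100 to print that thread's stack trace.
  6. Analyze the code based on the stack trace.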

These steps locate the code responsible for the CPU spike; a code review of that code is the next step.
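
As a rough illustration of steps 3 to 5, the sketch below wraps the hex conversion and the stack lookup in two small shell functions. The function names tid_to_hex and stack_for_tid are made up for this example, and the lookup assumes the thread dump labels native thread ids in the form nid=0x<hex>; check the output of your JDK's jstack before relying on that.

```shell
#!/bin/bash
# Hypothetical helpers illustrating steps 3-5; the names are not from the original article.

# Step 3: convert a decimal thread id (as reported by ps) to lowercase hex.
tid_to_hex() {
    printf '%x\n' "$1"
}

# Steps 4-5: read a thread dump on stdin and print the line that mentions the
# thread, plus up to 100 lines after it.
# Assumption: the dump labels native thread ids as nid=0x<hex>.
# -w keeps nid=0x1a2b from also matching nid=0x1a2bc.
stack_for_tid() {
    local tid_hex
    tid_hex=$(tid_to_hex "$1")
    grep -w -A100 -- "nid=0x${tid_hex}"
}

# Example usage (needs a running JVM, so it is left commented out):
#   jstack "$pid" | stack_for_tid 6699     # 6699 decimal is 1a2b in hex
```

Matching on the nid= prefix rather than the bare hex digits is a design choice made here to reduce accidental matches against unrelated addresses in the dump.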

Packaging the Steps as a Tool

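  1. The steps above have been packaged into a script. Given only the process id (pid), it prints the stack traces of the threads using the most CPU, the top 5 by default.
  2. The script has been uploaded to GitHub (the repository URL is in the script header). Usage: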
./java-thread-top.sh -p pid
#!/bin/bash
# @function
# Find out the highest cpu consumed threads of java processes, and print the stack of these threads.
# @github https://github.com/cjunn/script_tool/
# @author cjunn
# @date Sun Jan 12 2020 21:08:58 GMT+0800
#

pid='';
count=5;

function usage(){
    local prog
    prog="$(basename "$0")"
    cat <<EOF
Usage: ${prog} [OPTION]
Find out the highest cpu consumed threads of a java process,
and print the stack of these threads.
Example:
  ${prog} -p <pid> -c 5      # show top 5 busy java threads info
Output control:
  -p, --pid <java pid>      find out the highest cpu consumed threads from
                            the specified java process. Required.
  -c, --count <num>         set the thread count to show, default is 5.
Miscellaneous:
  -h, --help                display this help and exit.
EOF
}

#1.collect script parameters
#2.check that a pid was supplied
while [ $# -gt 0 ]; do
    case "$1" in
    -c|--count)
        count="$2"
        # shift 2 fails when the value is missing; stop parsing instead of looping forever
        shift 2 || break
        ;;
    -p|--pid)
        pid="$2"
        shift 2 || break
        ;;
    -h|--help)
        usage
        exit 0
        ;;
    --)
        shift
        break
        ;;
    *)
        # skip unrecognized arguments
        shift
        ;;
    esac
done
if [ -z "$pid" ]; then
    echo "error: -p is empty"
    exit 1
fi

function worker(){
    #1.query all threads according to pid.
    #2.delete the header line and the process summary line.
    #3.sort by the second column (%CPU), numerically, in descending order.
    #4.keep only the first $count lines.
    #5.read cpu utilization, tid and cpu time into cpu, tid, time respectively.
    #6.convert tid to hex.
    #7.run the jdk's jstack against pid to dump all threads.
    #8.use awk to pick the paragraphs of the dump that mention tid_hex.
    #9.print the stacks of the $count busiest threads.
    local whilec=0
    # Feed the loop through process substitution instead of a pipe, so that it
    # runs in the current shell and the whilec counter survives the loop.
    while read -r cpu tid time
    do
        tid_hex=$(printf "%x" "$tid")
        echo "====================== tid:${tid}  tid_hex:${tid_hex}  cpu:${cpu}  time:${time} ======================"
        jstack "$pid" | awk 'BEGIN {RS = "\n\n+"; ORS = "\n\n"} /'"${tid_hex}"'/ {print $0}'
        echo ""
        whilec=$((whilec+1))
    done < <(ps -mp "$pid" -o THREAD,tid,time | sed '1,2d' | sort -k 2 -n -r | head -n "$count" | awk '{print $2,$8,$9}')
    if [ "$whilec" -eq 0 ] ; then
        echo "error : thread not found, make sure pid exists."
    fi
}
worker
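
To make the ranking stage of worker easier to follow, here is a small sketch that applies the same filtering to ps output supplied on standard input. The function name top_threads is hypothetical, and the column positions (%CPU in column 2, TID in column 8, TIME in column 9) assume the layout that procps prints for -o THREAD,tid,time on Linux; other ps implementations may differ.

```shell
#!/bin/bash
# Hypothetical helper: rank threads by CPU from `ps -mp <pid> -o THREAD,tid,time` output.
# Reads the ps output on stdin and prints "cpu tid time" for the busiest threads.
top_threads() {
    local count="${1:-5}"            # how many threads to keep, default 5
    sed '1,2d' |                     # drop the header and the process summary line
        LC_ALL=C sort -k2,2 -n -r |  # highest %CPU (column 2) first
        head -n "$count" |           # keep only the top $count threads
        awk '{print $2, $8, $9}'     # assumed columns: %CPU, TID, TIME
}

# Example usage (needs a live process, so it is left commented out):
#   ps -mp "$pid" -o THREAD,tid,time | top_threads 3
```

Reading from standard input keeps the function easy to try against a saved ps capture, without needing the affected process to still be running.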