(282) Learning Simpleperf
2022-04-20 20:05:30
Official documentation
https://developer.android.google.cn/ndk/guides/simpleperf
Overview
Simpleperf is a general-purpose command-line CPU profiling tool. It is included in the NDK for Mac, Linux, and Windows.
Usage
raphael:/ # simpleperf
Usage: simpleperf [common options] subcommand [args_for_subcommand]
common options:
-h/--help Print this help information.
--log <severity> Set the minimum severity of logging. Possible severities
include verbose, debug, warning, info, error, fatal.
Default is info.
--log-to-android-buffer Write log to android log buffer instead of stderr.
--version Print version of simpleperf.
subcommands:
api-collect Collect recording data generated by app api
api-prepare Prepare recording via app api
debug-unwind Debug/test offline unwinding.
dump dump perf record file
help print help information for simpleperf
kmem collect kernel memory allocation information
list list available event types
record record sampling info in perf.data
report report sampling information in perf.data
report-sample report raw sample information in perf.data
stat gather performance counter information
trace-sched Trace system-wide process runtime events.
Running it
My first attempt failed with an error, because no perf.data existed in the current directory:
raphael:/ # simpleperf report --sort dso
simpleperf E record_file_reader.cpp:76] failed to open record file 'perf.data': No such file or directory
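I then followed https://blog.csdn.net/zsl_oo7/article/details/72799628
and added -o /sdcard/perf.data to record, so the record file is written to /sdcard, and passed the same path to report with -i.
Processes that implement Wi-Fi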
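The error above comes from report looking for perf.data in the current directory. One way to avoid the mismatch is to build both command lines from a single path. The following is a minimal sketch, not part of simpleperf: the helper names record_cmd and report_cmd are hypothetical, and only flags that appear in the help text in this post are used.

```python
import shlex


def record_cmd(pid, duration_s, out_path):
    """Build a `simpleperf record` command that writes to out_path (hypothetical helper)."""
    return ["simpleperf", "record", "-p", str(pid),
            "--duration", str(duration_s), "-o", out_path]


def report_cmd(in_path, sort_keys=("dso",), show_counts=True):
    """Build a `simpleperf report` command that reads in_path (hypothetical helper)."""
    cmd = ["simpleperf", "report", "-i", in_path]
    if show_counts:
        cmd.append("-n")  # -n prints the sample count for each item
    cmd += ["--sort", ",".join(sort_keys)]
    return cmd


# Use one variable for the data file so record and report always agree.
data_file = "/sdcard/perf.data"
print(shlex.join(record_cmd(788, 30, data_file)))
print(shlex.join(report_cmd(data_file)))
```

The two printed lines are meant to be pasted into a root shell on the device; the sketch itself does not run simpleperf.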
raphael:/ # ps -A | grep wifi
wifi 788 1 2186504 16356 binder_ioctl_write_read 0 S [email protected]
wifi 1201 1 38752 5140 SyS_epoll_wait 0 S wificond
system 1207 1 20992 6816 binder_ioctl_write_read 0 S wifidisplayhalservice
wifi 23135 1 2237784 10460 do_select 0 S wpa_supplicant
raphael:/ # simpleperf record -p 788 --duration 30 -o /sdcard/perf.data
simpleperf I cmd_record.cpp:635] Samples recorded: 368. Samples lost: 0.
raphael:/ # simpleperf report -i /sdcard/perf.data -n --sort dso
Cmdline: /system/bin/simpleperf record -p 788 --duration 30 -o /sdcard/perf.data
Arch: arm64
Event: cpu-cycles (type 0, config 0)
Samples: 368
Event count: 39321745
Overhead Sample Shared Object
36.01% 157 [kernel.kallsyms]
26.45% 89 /apex/com.android.runtime/lib64/bionic/libc.so
25.92% 85 /vendor/lib64/libwifi-hal.so
7.21% 22 /system/lib64/vndk-29/libnl.so
4.08% 14 [vdso]
0.33% 1 /vendor/bin/hw/[email protected]
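As a sanity check on the report above, the per-library rows can be parsed and summed. This is a rough sketch assuming the three-column layout shown (Overhead, Sample, Shared Object); parse_dso_report is a hypothetical helper, and the rows are copied as they appear in this post.

```python
REPORT = """\
Overhead Sample Shared Object
36.01% 157 [kernel.kallsyms]
26.45% 89 /apex/com.android.runtime/lib64/bionic/libc.so
25.92% 85 /vendor/lib64/libwifi-hal.so
7.21% 22 /system/lib64/vndk-29/libnl.so
4.08% 14 [vdso]
0.33% 1 /vendor/bin/hw/[email protected]
"""


def parse_dso_report(text):
    """Return (overhead_percent, sample_count, shared_object) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Data rows start with a percentage; the header row does not.
        if len(parts) == 3 and parts[0].endswith("%"):
            rows.append((float(parts[0].rstrip("%")), int(parts[1]), parts[2]))
    return rows


rows = parse_dso_report(REPORT)
total_samples = sum(count for _, count, _ in rows)
non_kernel = sum(pct for pct, _, dso in rows if dso != "[kernel.kallsyms]")

print(total_samples)         # 368, matching "Samples: 368"
print(round(non_kernel, 2))  # 63.99
```

So roughly 36% of the sampled cycles were spent in the kernel and about 64% outside it, with libc.so and libwifi-hal.so accounting for most of the rest.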
Next, let's see which function in libwifi-hal.so takes the most CPU time:
130|raphael:/ # simpleperf report -i /sdcard/perf.data --dsos /vendor/lib64/libwifi-hal.so --sort symbol
Cmdline: /system/bin/simpleperf record -p 788 --duration 30 -o /sdcard/perf.data
Arch: arm64
Event: cpu-cycles (type 0, config 0)
Samples: 85
Event count: 10193523
Overhead Symbol
48.79% rb_write(void*, unsigned char*, unsigned long, int, unsigned long) (.cfi)
11.90% @plt
10.26% process_firmware_prints(hal_info_s*, unsigned char*, unsigned short) (.cfi)
10.10% diag_message_handler(hal_info_s*, nl_msg*) (.cfi)
8.73% ring_buffer_write(rb_info*, unsigned char*, unsigned long, int, unsigned long) (.cfi)
4.11% wifi_event_loop.cfi
2.90% user_sock_message_handler(nl_msg*, void*) (.cfi)
1.23% no_seq_check(nl_msg*, void*)
1.15% ring_buffer_write(rb_info*, unsigned char*, unsigned long, int, unsigned long)
0.84% diag_message_handler(hal_info_s*, nl_msg*)
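Note that with --dsos the percentages are relative to the filtered event count (10193523), not to the whole profile (39321745). A small worked calculation, using only the two "Event count" values printed above, converts a within-library percentage into a share of the whole profile; whole_profile_percent is a hypothetical helper name.

```python
TOTAL_EVENTS = 39321745  # "Event count" of the unfiltered report
DSO_EVENTS = 10193523    # "Event count" with --dsos /vendor/lib64/libwifi-hal.so


def whole_profile_percent(in_dso_percent):
    """Scale a percentage within the filtered DSO to the whole profile."""
    return in_dso_percent * DSO_EVENTS / TOTAL_EVENTS


# Share of libwifi-hal.so in the whole profile.
print(f"{100 * DSO_EVENTS / TOTAL_EVENTS:.2f}%")   # 25.92%
# rb_write is 48.79% of libwifi-hal.so, i.e. this much of all sampled cycles.
print(f"{whole_profile_percent(48.79):.2f}%")      # 12.65%
```

The first value agrees with the 25.92% shown for libwifi-hal.so in the --sort dso report, which suggests the scaling is being read correctly.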
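My understanding is that record samples the target's CPU usage and saves it to a record file (perf.data), while report parses that file and prints it as text. The first report above is sorted by shared object (--sort dso); the official documentation also shows sorting by pid.
Here is the help text for record: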
130|raphael:/ # simpleperf help record
Usage: simpleperf record [options] [--] [command [command-args]]
Gather sampling information of running [command]. And -a/-p/-t option
can be used to change target of sampling information.
The default options are: -e cpu-cycles -f 4000 -o perf.data.
Select monitored threads:
-a System-wide collection.
--app package_name Profile the process of an Android application.
On non-rooted devices, the app must be debuggable,
because we use run-as to switch to the app's context.
-p pid1,pid2,... Record events on existing processes. Mutually exclusive
with -a.
-t tid1,tid2,... Record events on existing threads. Mutually exclusive with -a.
Select monitored event types:
-e event1[:modifier1],event2[:modifier2],...
Select a list of events to record. An event can be:
1) an event name listed in `simpleperf list`;
2) a raw PMU event in rN format. N is a hex number.
For example, r1b selects event number 0x1b.
Modifiers can be added to define how the event should be
monitored. Possible modifiers are:
u - monitor user space events only
k - monitor kernel space events only
--group event1[:modifier],event2[:modifier2],...
Similar to -e option. But events specified in the same --group
option are monitored as a group, and scheduled in and out at the
same time.
--trace-offcpu Generate samples when threads are scheduled off cpu.
Similar to "-c 1 -e sched:sched_switch".
Select monitoring options:
-f freq Set event sample frequency. It means recording at most [freq]
samples every second. For non-tracepoint events, the default
option is -f 4000. A -f/-c option affects all event types
following it until meeting another -f/-c option. For example,
for "-f 1000 cpu-cycles -c 1 -e sched:sched_switch", cpu-cycles
has sample freq 1000, sched:sched_switch event has sample period 1.
-c count Set event sample period. It means recording one sample when
[count] events happen. For tracepoint events, the default option
is -c 1.
--call-graph fp | dwarf[,<dump_stack_size>]
Enable call graph recording. Use frame pointer or dwarf debug
frame as the method to parse call graph in stack.
Default is dwarf,65528.
-g Same as '--call-graph dwarf'.
--clockid clock_id Generate timestamps of samples using selected clock.
Possible values are: realtime, monotonic,
monotonic_raw, boottime, perf. If supported, default
is monotonic, otherwise is perf.
--cpu cpu_item1,cpu_item2,...
Collect samples only on the selected cpus. cpu_item can be cpu
number like 1, or cpu range like 0-3.
--duration time_in_sec Monitor for time_in_sec seconds instead of running
[command]. Here time_in_sec may be any positive
floating point number.
-j branch_filter1,branch_filter2,...
Enable taken branch stack sampling. Each sample captures a series
of consecutive taken branches.
The following filters are defined:
any: any type of branch
any_call: any function call or system call
any_ret: any function return or system call return
ind_call: any indirect branch
u: only when the branch target is at the user level
k: only when the branch target is in the kernel
This option requires at least one branch type among any, any_call,
any_ret, ind_call.
-b Enable taken branch stack sampling. Same as '-j any'.
-m mmap_pages Set the size of the buffer used to receiving sample data from
the kernel. It should be a power of 2. If not set, the max
possible value <= 1024 will be used.
--no-inherit Don't record created child threads/processes.
--cpu-percent <percent> Set the max percent of cpu time used for recording.
percent is in range [1-100], default is 25.
Dwarf unwinding options:
--post-unwind=(yes|no) If `--call-graph dwarf` option is used, then the user's
stack will be recorded in perf.data and unwound while
recording by default. Use --post-unwind=yes to switch
to unwind after recording.
--no-unwind If `--call-graph dwarf` option is used, then the user's stack
will be unwound by default. Use this option to disable the
unwinding of the user's stack.
--no-callchain-joiner If `--call-graph dwarf` option is used, then by default
callchain joiner is used to break the 64k stack limit
and build more complete call graphs. However, the built
call graphs may not be correct in all cases.
--callchain-joiner-min-matching-nodes count
When callchain joiner is used, set the matched nodes needed to join
callchains. The count should be >= 1. By default it is 1.
Recording file options:
--no-dump-kernel-symbols Don't dump kernel symbols in perf.data. By default
kernel symbols will be dumped when needed.
--no-dump-symbols Don't dump symbols in perf.data. By default symbols are
dumped in perf.data, to support reporting in another
environment.
-o record_file_name Set record file name, default is perf.data.
--size-limit SIZE[K|M|G] Stop recording after SIZE bytes of records.
Default is unlimited.
--symfs <dir> Look for files with symbols relative to this directory.
This option is used to provide files with symbol table and
debug information, which are used for unwinding and dumping symbols.
Other options:
--exit-with-parent Stop recording when the process starting
simpleperf dies.
--start_profiling_fd fd_no After starting profiling, write "STARTED" to
<fd_no>, then close <fd_no>.
--stdio-controls-profiling Use stdin/stdout to pause/resume profiling.
--in-app We are already running in the app's context.
--tracepoint-events file_name Read tracepoint events from [file_name] instead of tracefs.
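The help text says -f is an upper bound ("at most [freq] samples every second"), which helps explain why the 30-second recording above produced only 368 samples. The numbers below are taken from this post; the interpretation in the comments is my assumption, namely that cpu-cycles only advances while the process is actually running on a CPU.

```python
DEFAULT_FREQ_HZ = 4000   # default "-f 4000" from the help text
DURATION_S = 30          # --duration 30
SAMPLES = 368            # "Samples recorded: 368"
EVENT_COUNT = 39321745   # "Event count" for cpu-cycles

# Upper bound on samples if the process had been busy the whole time.
max_samples = DEFAULT_FREQ_HZ * DURATION_S
# What was actually observed.
samples_per_second = SAMPLES / DURATION_S
avg_cycles_per_sample = EVENT_COUNT / SAMPLES

print(max_samples)                      # 120000
print(f"{samples_per_second:.1f}")      # about 12 samples per second
print(f"{avg_cycles_per_sample:.0f}")   # roughly 1e5 cycles between samples
```

A recording this far below the upper bound suggests the process was mostly idle during the 30 seconds, so the percentages in the reports describe a small amount of CPU time.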
Drawing the call graph
1|raphael:/ # simpleperf help report
Usage: simpleperf report [options]
The default options are: -i perf.data --sort comm,pid,tid,dso,symbol.
-b Use the branch-to addresses in sampled take branches instead of the
instruction addresses. Only valid for perf.data recorded with -b/-j
option.
--children Print the overhead accumulated by appearing in the callchain.
--comms comm1,comm2,... Report only for selected comms.
--dsos dso1,dso2,... Report only for selected dsos.
--full-callgraph Print full call graph. Used with -g option. By default,
brief call graph is printed.
-g [callee|caller] Print call graph. If callee mode is used, the graph
shows how functions are called from others. Otherwise,
the graph shows how functions call others.
Default is caller mode.
-i <file> Specify path of record file, default is perf.data.
--kallsyms <file> Set the file to read kernel symbols.
--max-stack <frames> Set max stack frames shown when printing call graph.
-n Print the sample count for each item.
--no-demangle Don't demangle symbol names.
--no-show-ip Don't show vaddr in file for unknown symbols.
-o report_file_name Set report file name, default is stdout.
--percent-limit <percent> Set min percentage shown when printing call graph.
--pids pid1,pid2,... Report only for selected pids.
--raw-period Report period count instead of period percentage.
--sort key1,key2,... Select keys used to sort and print the report. The
appearance order of keys decides the order of keys used
to sort and print the report.
Possible keys include:
pid -- process id
tid -- thread id
comm -- thread name (can be changed during
the lifetime of a thread)
dso -- shared library
symbol -- function name in the shared library
vaddr_in_file -- virtual address in the shared
library
Keys can only be used with -b option:
dso_from -- shared library branched from
dso_to -- shared library branched to
symbol_from -- name of function branched from
symbol_to -- name of function branched to
The default sort keys are:
comm,pid,tid,dso,symbol
--symbols symbol1;symbol2;... Report only for selected symbols.
--symfs <dir> Look for files with symbols relative to this directory.
--tids tid1,tid2,... Report only for selected tids.
--vmlinux <file> Parse kernel symbols from <file>.
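The --sort rules in the help text above (six general keys, plus four keys that are only valid with -b) can be sketched as a small validator. This is an illustration only: sort_option is a hypothetical helper, and the real simpleperf may report invalid keys differently.

```python
BASIC_KEYS = {"pid", "tid", "comm", "dso", "symbol", "vaddr_in_file"}
BRANCH_KEYS = {"dso_from", "dso_to", "symbol_from", "symbol_to"}  # need -b


def sort_option(keys, branch_mode=False):
    """Validate sort keys and return the matching report arguments."""
    for key in keys:
        if key not in BASIC_KEYS | BRANCH_KEYS:
            raise ValueError(f"unknown sort key: {key}")
        if key in BRANCH_KEYS and not branch_mode:
            raise ValueError(f"sort key {key} can only be used with -b")
    args = ["-b"] if branch_mode else []
    # The order of keys decides the order used to sort and print the report.
    return args + ["--sort", ",".join(keys)]


print(sort_option(["comm", "pid", "tid", "dso", "symbol"]))  # the default keys
print(sort_option(["dso_from", "dso_to"], branch_mode=True))
```

Passing ["pid"] gives the "sorted by pid" variant mentioned in the official documentation.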
raphael:/ # simpleperf report -g -i /sdcard/perf.data
Cmdline: /system/bin/simpleperf record -p 788 --duration 30 -o /sdcard/perf.data
Arch: arm64
Event: cpu-cycles (type 0, config 0)
Samples: 368
Event count: 39321745
Children Self Command Pid Tid Shared Object Symbol
12.65% 12.65% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so rb_write(void*, unsigned char*, unsigned long, int, unsigned long) (.cfi)
6.90% 6.90% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so pthread_mutex_unlock
5.81% 5.81% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so pthread_mutex_lock
4.08% 4.08% [email protected] 788 23141 [vdso] __kernel_gettimeofday
4.01% 4.01% [email protected] 788 23141 [kernel.kallsyms] el0_sys
3.74% 3.74% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so __memcpy
3.09% 3.09% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so @plt
2.82% 2.82% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so je_free
2.66% 2.66% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so process_firmware_prints(hal_info_s*, unsigned char*, unsigned short) (.cfi)
2.62% 2.62% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so diag_message_handler(hal_info_s*, nl_msg*) (.cfi)
2.26% 2.26% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so ring_buffer_write(rb_info*, unsigned char*, unsigned long, int, unsigned long) (.cfi)
2.18% 2.18% [email protected] 788 23141 [kernel.kallsyms] el0_svc_naked
1.95% 1.95% [email protected] 788 23141 [kernel.kallsyms] cntvct_read_handler
1.92% 1.92% [email protected] 788 23141 [kernel.kallsyms] do_notify_resume
1.89% 1.89% [email protected] 788 23141 [kernel.kallsyms] do_sys_poll
1.78% 1.78% [email protected] 788 23141 [kernel.kallsyms] sock_poll
1.75% 1.75% [email protected] 788 23141 [kernel.kallsyms] do_sysinstr
1.71% 1.71% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so @plt
1.44% 1.44% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so je_malloc
1.40% 1.40% [email protected] 788 23141 [kernel.kallsyms] __fget_light
1.40% 1.40% [email protected] 788 23141 [kernel.kallsyms] datagram_poll
1.34% 1.34% [email protected] 788 23141 [kernel.kallsyms] _raw_spin_unlock_irqrestore
1.29% 1.29% [email protected] 788 23141 [kernel.kallsyms] __arch_copy_to_user
1.12% 1.12% [email protected]vic 788 23141 /system/lib64/vndk-29/libnl.so nlmsg_next
1.06% 1.06% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so wifi_event_loop.cfi
0.99% 0.99% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so nl_recvmsgs_report
0.93% 0.93% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so je_calloc
0.88% 0.88% [email protected] 788 23141 [kernel.kallsyms] __check_object_size
0.81% 0.81% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so memset
0.80% 0.80% [email protected] 788 23141 [kernel.kallsyms] skb_copy_datagram_iter
0.76% 0.76% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so genlmsg_attrdata
0.75% 0.75% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so user_sock_message_handler(nl_msg*, void*) (.cfi)
0.74% 0.74% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so @plt
0.65% 0.65% [email protected] 788 23141 [kernel.kallsyms] __softirqentry_text_start
0.62% 0.62% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so nl_recv
0.61% 0.61% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so nl_socket_get_cb
0.60% 0.60% [email protected] 788 23141 [kernel.kallsyms] sys_ppoll
0.59% 0.59% [email protected] 788 23141 [kernel.kallsyms] fpsimd_load_state
0.59% 0.59% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so gettimeofday
0.56% 0.56% [email protected] 788 23141 [kernel.kallsyms] __wake_up_common_lock
0.54% 0.54% [email protected] 788 23141 [kernel.kallsyms] schedule_hrtimeout_range_clock
0.51% 0.51% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so nlmsg_data
0.49% 0.49% [email protected] 788 23141 [kernel.kallsyms] security_socket_recvmsg
0.48% 0.48% [email protected] 788 23141 [kernel.kallsyms] sys_recvmsg
0.46% 0.46% [email protected] 788 23141 [kernel.kallsyms] pte_map_lock
0.46% 0.46% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so __ppoll
0.46% 0.46% [email protected] 788 23141 [kernel.kallsyms] move_addr_to_user
0.45% 0.45% [email protected] 788 23141 [kernel.kallsyms] memblock_is_map_memory
0.45% 0.45% [email protected] 788 23141 [kernel.kallsyms] avc_lookup
0.43% 0.43% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so free
0.42% 0.42% [email protected] 788 23141 [kernel.kallsyms] fput
0.41% 0.41% [email protected] 788 23141 [kernel.kallsyms] poll_freewait
0.39% 0.39% [email protected] 788 23141 [kernel.kallsyms] htc_rx_completion_handler
0.37% 0.37% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so extent_deregister_impl
0.37% 0.37% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so pthread_getspecific
0.37% 0.37% [email protected] 788 23141 [kernel.kallsyms] unmap_page_range
0.34% 0.34% [email protected] 788 23141 [kernel.kallsyms] skb_recv_datagram
0.34% 0.34% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so nlmsg_set_src
0.33% 0.33% [email protected] 788 23141 [kernel.kallsyms] tasklet_hi_action
0.33% 0.33% [email protected] 788 23141 [kernel.kallsyms] unix_dgram_poll
0.33% 0.33% [email protected] 788 23141 /vendor/bin/hw/[email protected] android::hardware::wifi::V1_3::implementation::Ringbuffer::append(std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > const&)
0.33% 0.33% [email protected] 788 23141 [kernel.kallsyms] do_mem_abort
0.32% 0.32% [email protected] 788 23141 [kernel.kallsyms] get_page_from_freelist
0.32% 0.32% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so no_seq_check(nl_msg*, void*)
0.31% 0.31% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so calloc
0.31% 0.31% [email protected] 788 23141 [kernel.kallsyms] __skb_try_recv_datagram
0.31% 0.31% [email protected] 788 23141 [kernel.kallsyms] __rcu_read_unlock
0.31% 0.31% [email protected] 788 23141 [kernel.kallsyms] skb_free_datagram
0.30% 0.30% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so ring_buffer_write(rb_info*, unsigned char*, unsigned long, int, unsigned long)
0.30% 0.30% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so nl_cb_put
0.29% 0.29% [email protected] 788 23141 [kernel.kallsyms] selinux_socket_recvmsg
0.28% 0.28% [email protected] 788 23141 [kernel.kallsyms] ce_per_engine_service
0.28% 0.28% [email protected] 788 23141 [kernel.kallsyms] put_pid
0.28% 0.28% [email protected] 788 23141 [kernel.kallsyms] free_hot_cold_page
0.28% 0.28% [email protected] 788 23141 [kernel.kallsyms] lru_add_drain_cpu
0.28% 0.28% [email protected] 788 23141 [kernel.kallsyms] vmacache_find
0.27% 0.27% [email protected] 788 23141 [kernel.kallsyms] __check_heap_object
0.27% 0.27% [email protected] 788 23141 [kernel.kallsyms] queue_work_on
0.26% 0.26% [email protected] 788 23141 [kernel.kallsyms] ___sys_recvmsg
0.26% 0.26% [email protected] 788 23141 [kernel.kallsyms] tcs_notify_tx_done
0.25% 0.25% [email protected] 788 23141 [kernel.kallsyms] netlink_recvmsg
0.25% 0.25% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so poll
0.24% 0.24% [email protected] 788 23141 /system/lib64/vndk-29/libnl.so genlmsg_attrlen
0.24% 0.24% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so je_extent_avail_first
0.23% 0.23% [email protected] 788 23141 /apex/com.android.runtime/lib64/bionic/libc.so malloc
0.22% 0.22% [email protected] 788 23141 /vendor/lib64/libwifi-hal.so diag_message_handler(hal_info_s*, nl_msg*)
0.21% 0.21% [email protected] 788 23141 [kernel.kallsyms] copy_msghdr_from_user
0.20% 0.20% [email protected] 788 23141 [kernel.kallsyms] avtab_search_node
0.05% 0.05% [email protected] 788 23141 [kernel.kallsyms] rw_copy_check_uvector
0.02% 0.02% [email protected] 788 23141 [kernel.kallsyms] _raw_spin_unlock_irq
0.01% 0.01% [email protected] 788 23141 [kernel.kallsyms] finish_task_switch
0.01% 0.01% [email protected] 788 23141 [kernel.kallsyms] __schedule