Android System Layer Performance Monitoring: Monitoring Jank

程序员文章站 2022-09-05 11:43:45

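Background

If you are an Android system engineer, you may well need to monitor jank across all apps on the system so that you can help app teams fix it and improve the user experience. More importantly, app developers can rarely catch jank on every screen themselves; with support from the system, the janky Activity can be identified.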

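Background Reading

Android screen refresh mechanism
A beginner's guide to how Android hardware acceleration works

A rough outline of how the UI draws one frame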

1. Whether the trigger is resume, invalidate, or any other call that refreshes the UI, the request eventually reaches ViewRootImpl.scheduleTraversals:

    void scheduleTraversals() {
        if (!mTraversalScheduled) {
            mTraversalScheduled = true;
            mTraversalBarrier = mHandler.getLooper().getQueue().postSyncBarrier();
            mChoreographer.postCallback(
                    Choreographer.CALLBACK_TRAVERSAL, mTraversalRunnable, null);
            if (!mUnbufferedInputDispatch) {
                scheduleConsumeBatchedInput();
            }
            notifyRendererOfFramePending();
            pokeDrawLockIfNeeded();
        }
    }

2. Inside scheduleTraversals, the most important step is the call to Choreographer.postCallback. According to its Javadoc, the callback runs when the next frame signal arrives:

    /**
     * Posts a callback to run on the next frame.
     * <p>
     * The callback runs once then is automatically removed.
     * </p>
     */
    @TestApi
    public void postCallback(int callbackType, Runnable action, Object token) {
        postCallbackDelayed(callbackType, action, token, 0);
    }

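3. mTraversalRunnable is what runs once the frame signal arrives. It calls doTraversal(), which (through performTraversals()) runs performMeasure(), performLayout(), and performDraw() in order.
4. Under what conditions do the callbacks added through Choreographer.postCallback actually run? See Choreographer.java: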

    public void postCallback(int callbackType, Runnable action, Object token) {
        postCallbackDelayed(callbackType, action, token, 0);
    }

    private void scheduleVsyncLocked() {
        mDisplayEventReceiver.scheduleVsync();
    }   

The call is passed down layer by layer and eventually reaches mDisplayEventReceiver.scheduleVsync(). Here is DisplayEventReceiver.java:

    /**
     * Schedules a single vertical sync pulse to be delivered when the next
     * display frame begins.
     */
    public void scheduleVsync() {
        if (mReceiverPtr == 0) {
            Log.w(TAG, "Attempted to schedule a vertical sync pulse but the display event "
                    + "receiver has already been disposed.");
        } else {
            nativeScheduleVsync(mReceiverPtr);
        }
    }

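According to the Javadoc, this method asks the DisplayEventReceiver instance to schedule a single vertical sync pulse (think of it as a one-shot callback) that is delivered when the next display frame begins.
5. Because nativeScheduleVsync is called with this DisplayEventReceiver's native pointer, the receiver's onVsync method is called back when the next frame signal arrives: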

    /**
     * Called when a vertical sync pulse is received.
     * The recipient should render a frame and then call {@link #scheduleVsync}
     * to schedule the next vertical sync pulse.
     *
     * @param timestampNanos The timestamp of the pulse, in the {@link System#nanoTime()}
     * timebase.
     * @param builtInDisplayId The surface flinger built-in display id such as
     * {@link SurfaceControl#BUILT_IN_DISPLAY_ID_MAIN}.
     * @param frame The frame number.  Increases by one for each vertical sync interval.
     */
    public void onVsync(long timestampNanos, int builtInDisplayId, int frame) {
    }

6. Choreographer instantiates a FrameDisplayEventReceiver internally. It overrides onVsync, and the call eventually reaches Choreographer.doFrame().

    private void postCallbackDelayedInternal(int callbackType,
            Object action, Object token, long delayMillis) {
        synchronized (mLock) {
            final long now = SystemClock.uptimeMillis();
            final long dueTime = now + delayMillis;
            mCallbackQueues[callbackType].addCallbackLocked(dueTime, action, token);
        }
    }

    void doFrame(long frameTimeNanos, int frame) {
        ......
        try {
            Trace.traceBegin(Trace.TRACE_TAG_VIEW, "Choreographer#doFrame");
            AnimationUtils.lockAnimationClock(frameTimeNanos / TimeUtils.NANOS_PER_MS);

            mFrameInfo.markInputHandlingStart();
            doCallbacks(Choreographer.CALLBACK_INPUT, frameTimeNanos);

            mFrameInfo.markAnimationsStart();
            doCallbacks(Choreographer.CALLBACK_ANIMATION, frameTimeNanos);

            mFrameInfo.markPerformTraversalsStart();
            doCallbacks(Choreographer.CALLBACK_TRAVERSAL, frameTimeNanos);

            doCallbacks(Choreographer.CALLBACK_COMMIT, frameTimeNanos);
        } finally {
            AnimationUtils.unlockAnimationClock();
            Trace.traceEnd(Trace.TRACE_TAG_VIEW);
        }
    }

The key collection is mCallbackQueues. postCallback adds the Runnable passed in to it; doFrame calls doCallbacks, which takes the Runnables out of mCallbackQueues and runs them. This is how ViewRootImpl.doTraversal() finally gets executed.
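
To make the post-then-run cycle concrete, here is a minimal single-threaded sketch of the flow described above. MiniChoreographer and MiniViewRoot are hypothetical names invented for this example, not AOSP classes, and doFrame() is called by hand where the real code would be driven by a vsync signal. It only illustrates two points: callbacks are queued by type and drained in a fixed order, and repeated scheduleTraversals() calls within one frame collapse into a single traversal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical, simplified model of Choreographer's per-type callback queues.
class MiniChoreographer {
    static final int CALLBACK_INPUT = 0;
    static final int CALLBACK_ANIMATION = 1;
    static final int CALLBACK_TRAVERSAL = 2;
    static final int CALLBACK_COMMIT = 3;

    private final List<Queue<Runnable>> queues = new ArrayList<>();

    MiniChoreographer() {
        for (int i = 0; i <= CALLBACK_COMMIT; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    // Like postCallback: remember the action until the next frame.
    void postCallback(int callbackType, Runnable action) {
        queues.get(callbackType).add(action);
    }

    // Like doFrame: drain each queue in a fixed order; every callback runs once.
    void doFrame() {
        for (Queue<Runnable> queue : queues) {
            // Snapshot first so callbacks posted during this frame wait for the next one.
            List<Runnable> due = new ArrayList<>(queue);
            queue.clear();
            for (Runnable action : due) {
                action.run();
            }
        }
    }

    public static void main(String[] args) {
        MiniChoreographer choreographer = new MiniChoreographer();
        MiniViewRoot root = new MiniViewRoot(choreographer);
        root.scheduleTraversals();
        root.scheduleTraversals(); // coalesced with the first request
        choreographer.doFrame();   // stands in for the vsync-driven doFrame
        System.out.println("traversals after one frame: " + root.traversalCount);
    }
}

// Hypothetical stand-in for ViewRootImpl's scheduling flag.
class MiniViewRoot {
    private final MiniChoreographer choreographer;
    private boolean traversalScheduled;
    int traversalCount;

    MiniViewRoot(MiniChoreographer choreographer) {
        this.choreographer = choreographer;
    }

    // Like scheduleTraversals: only the first request in a frame posts a callback.
    void scheduleTraversals() {
        if (!traversalScheduled) {
            traversalScheduled = true;
            choreographer.postCallback(MiniChoreographer.CALLBACK_TRAVERSAL, this::doTraversal);
        }
    }

    private void doTraversal() {
        traversalScheduled = false;
        traversalCount++; // measure, layout and draw would happen here
    }
}
```

The sketch leaves out the sync barrier, due times, and locking that the real classes use; it is meant only as a mental model of the queueing.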

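The draw flow on the GPU under hardware acceleration

1. The CPU does the computation: measure and layout both run on the main thread. The View tree is abstracted into RenderNode objects, which are handed over for GPU rendering.
ViewRootImpl.java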

    private boolean draw(boolean fullRedrawNeeded) {
        ....
        // Key point 1: is hardware acceleration enabled?
        if (mAttachInfo.mThreadedRenderer != null && mAttachInfo.mThreadedRenderer.isEnabled()) {
            ....
            // Key point 2: draw with hardware acceleration
            mAttachInfo.mThreadedRenderer.draw(mView, mAttachInfo, this, callback);
            ....
        }
        ....
    }

GPU drawing goes through ThreadedRenderer.
2. The draw method in ThreadedRenderer.java:

    /**
     * Draws the specified view
     */
    void draw(View view, AttachInfo attachInfo, DrawCallbacks callbacks,
            FrameDrawingCallback frameDrawingCallback) {
        attachInfo.mIgnoreDirtyState = true;

        final Choreographer choreographer = attachInfo.mViewRootImpl.mChoreographer;
        choreographer.mFrameInfo.markDrawStart();

        updateRootDisplayList(view, callbacks);
        ....
        final long[] frameInfo = choreographer.mFrameInfo.mFrameInfo;
        if (frameDrawingCallback != null) {
            nSetFrameCallback(mNativeProxy, frameDrawingCallback);
        }
        int syncResult = nSyncAndDrawFrame(mNativeProxy, frameInfo, frameInfo.length);
        if ((syncResult & SYNC_LOST_SURFACE_REWARD_IF_FOUND) != 0) {
            setEnabled(false);
            attachInfo.mViewRootImpl.mSurface.release();
            // Invalidate since we failed to draw. This should fetch a Surface
            // if it is still needed or do nothing if we are no longer drawing
            attachInfo.mViewRootImpl.invalidate();
        }
        if ((syncResult & SYNC_INVALIDATE_REQUIRED) != 0) {
            attachInfo.mViewRootImpl.invalidate();
        }
    }

    private static native int nSyncAndDrawFrame(long nativeProxy, long[] frameInfo, int size);

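The native method nSyncAndDrawFrame hands the frame to the lower layers for drawing. If the returned syncResult reports a lost surface or has SYNC_INVALIDATE_REQUIRED set, ViewRootImpl.invalidate() is called to request another frame.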
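
The syncResult checks at the end of draw() are plain bit-flag tests. A small sketch of that pattern follows; the two flag names come from the excerpt above, but the bit positions assigned to them here are assumptions made for this example, and SyncResultFlags is a hypothetical class.

```java
// Hypothetical helper mirroring the flag checks at the end of ThreadedRenderer.draw().
class SyncResultFlags {
    // Flag names come from the excerpt above; the bit positions are assumptions.
    static final int SYNC_INVALIDATE_REQUIRED = 1 << 0;
    static final int SYNC_LOST_SURFACE_REWARD_IF_FOUND = 1 << 1;

    // True when the renderer would be disabled and the Surface released.
    static boolean lostSurface(int syncResult) {
        return (syncResult & SYNC_LOST_SURFACE_REWARD_IF_FOUND) != 0;
    }

    // True when either branch in draw() would end up calling invalidate().
    static boolean needsInvalidate(int syncResult) {
        int mask = SYNC_LOST_SURFACE_REWARD_IF_FOUND | SYNC_INVALIDATE_REQUIRED;
        return (syncResult & mask) != 0;
    }

    public static void main(String[] args) {
        int result = SYNC_INVALIDATE_REQUIRED;
        System.out.println("lost surface: " + lostSurface(result));
        System.out.println("needs invalidate: " + needsInvalidate(result));
    }
}
```

A result of 0 takes neither branch, so an ordinary successful frame does not trigger an extra invalidate.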

How the native layer takes over drawing

1. nSyncAndDrawFrame is implemented in frameworks/base/core/jni/android_view_ThreadedRenderer.cpp:

static int android_view_ThreadedRenderer_syncAndDrawFrame(JNIEnv* env, jobject clazz,
        jlong proxyPtr, jlongArray frameInfo, jint frameInfoSize) {
    LOG_ALWAYS_FATAL_IF(frameInfoSize != UI_THREAD_FRAME_INFO_SIZE,
            "Mismatched size expectations, given %d expected %d",
            frameInfoSize, UI_THREAD_FRAME_INFO_SIZE);
    RenderProxy* proxy = reinterpret_cast<RenderProxy*>(proxyPtr);
    env->GetLongArrayRegion(frameInfo, 0, frameInfoSize, proxy->frameInfo());
    return proxy->syncAndDrawFrame();
}

2. RenderProxy.syncAndDrawFrame
frameworks/base/libs/hwui/renderthread/RenderProxy.cpp

int RenderProxy::syncAndDrawFrame() {
    return mDrawFrameTask.drawFrame();
}

3. frameworks/base/libs/hwui/renderthread/DrawFrameTask.cpp

int DrawFrameTask::drawFrame() {
    LOG_ALWAYS_FATAL_IF(!mContext, "Cannot drawFrame with no CanvasContext!");

    mSyncResult = SyncResult::OK;
    mSyncQueued = systemTime(CLOCK_MONOTONIC);
    postAndWait();

    return mSyncResult;
}

void DrawFrameTask::postAndWait() {
    AutoMutex _lock(mLock);
    mRenderThread->queue().post([this]() { run(); });
    mSignal.wait(mLock);
}

void DrawFrameTask::run() {
    ......
    context->draw();
    ......
}

4. context is a CanvasContext; see frameworks/base/libs/hwui/renderthread/CanvasContext.cpp

void CanvasContext::draw() {
    SkRect dirty;
    mDamageAccumulator.finish(&dirty);

    if (dirty.isEmpty() && Properties::skipEmptyFrames && !surfaceRequiresRedraw()) {
        mCurrentFrameInfo->addFlag(FrameInfoFlags::SkippedFrame);
        return;
    }

    mCurrentFrameInfo->markIssueDrawCommandsStart();

    Frame frame = mRenderPipeline->getFrame();
    setPresentTime();

    SkRect windowDirty = computeDirtyRect(frame, &dirty);

    bool drew = mRenderPipeline->draw(frame, windowDirty, dirty, mLightGeometry, &mLayerUpdateQueue,
                                      mContentDrawBounds, mOpaque, mLightInfo, mRenderNodes,
                                      &(profiler()));

    int64_t frameCompleteNr = mFrameCompleteCallbacks.size() ? getFrameNumber() : -1;

    waitOnFences();

    bool requireSwap = false;
    bool didSwap =
            mRenderPipeline->swapBuffers(frame, drew, windowDirty, mCurrentFrameInfo, &requireSwap);

    mIsDirty = false;

    if (requireSwap) {
        if (!didSwap) {  // some error happened
            setSurface(nullptr);
        }
        SwapHistory& swap = mSwapHistory.next();
        swap.damage = windowDirty;
        swap.swapCompletedTime = systemTime(CLOCK_MONOTONIC);
        swap.vsyncTime = mRenderThread.timeLord().latestVsync();
        if (mNativeSurface.get()) {
            int durationUs;
            nsecs_t dequeueStart = mNativeSurface->getLastDequeueStartTime();
            if (dequeueStart < mCurrentFrameInfo->get(FrameInfoIndex::SyncStart)) {
                // Ignoring dequeue duration as it happened prior to frame render start
                // and thus is not part of the frame.
                swap.dequeueDuration = 0;
            } else {
                mNativeSurface->query(NATIVE_WINDOW_LAST_DEQUEUE_DURATION, &durationUs);
                swap.dequeueDuration = us2ns(durationUs);
            }
            mNativeSurface->query(NATIVE_WINDOW_LAST_QUEUE_DURATION, &durationUs);
            swap.queueDuration = us2ns(durationUs);
        } else {
            swap.dequeueDuration = 0;
            swap.queueDuration = 0;
        }
        mCurrentFrameInfo->set(FrameInfoIndex::DequeueBufferDuration) = swap.dequeueDuration;
        mCurrentFrameInfo->set(FrameInfoIndex::QueueBufferDuration) = swap.queueDuration;
        mHaveNewSurface = false;
        mFrameNumber = -1;
    } else {
        mCurrentFrameInfo->set(FrameInfoIndex::DequeueBufferDuration) = 0;
        mCurrentFrameInfo->set(FrameInfoIndex::QueueBufferDuration) = 0;
    }


#if LOG_FRAMETIME_MMA
    float thisFrame = mCurrentFrameInfo->duration(FrameInfoIndex::IssueDrawCommandsStart,
                                                  FrameInfoIndex::FrameCompleted) /
                      NANOS_PER_MILLIS_F;
    if (sFrameCount) {
        sBenchMma = ((9 * sBenchMma) + thisFrame) / 10;
    } else {
        sBenchMma = thisFrame;
    }
    if (++sFrameCount == 10) {
        sFrameCount = 1;
        ALOGD("Average frame time: %.4f", sBenchMma);
    }
#endif

    if (didSwap) {
        for (auto& func : mFrameCompleteCallbacks) {
            std::invoke(func, frameCompleteNr);
        }
        mFrameCompleteCallbacks.clear();
    }

    mJankTracker.finishFrame(*mCurrentFrameInfo);
    if (CC_UNLIKELY(mFrameMetricsReporter.get() != nullptr)) {
        mFrameMetricsReporter->reportFrameMetrics(mCurrentFrameInfo->data());
    }

    GpuMemoryTracker::onFrameCompleted();
}

Once native drawing finishes, the result sits in a shared graphics buffer, and swapBuffers hands that buffer over to the Surface for display.
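
The postAndWait() step in DrawFrameTask above is a blocking handoff: the calling thread posts work to the render thread and sleeps until it is signalled. Below is a rough Java analogue of that pattern, with a single-thread executor standing in for the RenderThread and a CountDownLatch playing the role of mSignal. DrawFrameHandoff is a hypothetical class. The elided parts of run() decide when the waiting thread is actually released; this sketch simply releases it when the task ends, which is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical analogue of DrawFrameTask::drawFrame() / postAndWait().
class DrawFrameHandoff {
    // A single worker stands in for the RenderThread.
    private final ExecutorService renderThread = Executors.newSingleThreadExecutor();
    private volatile String drawnOn;

    // Posts the draw task, then blocks the caller until the task signals.
    String drawFrame() {
        CountDownLatch done = new CountDownLatch(1);
        renderThread.execute(() -> {
            try {
                // Real drawing would go here; we only record which thread ran it.
                drawnOn = Thread.currentThread().getName();
            } finally {
                done.countDown(); // plays the role of mSignal
            }
        });
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the frame", e);
        }
        return drawnOn;
    }

    void shutdown() {
        renderThread.shutdown();
    }

    public static void main(String[] args) {
        DrawFrameHandoff handoff = new DrawFrameHandoff();
        System.out.println("caller thread: " + Thread.currentThread().getName());
        System.out.println("drawn on: " + handoff.drawFrame());
        handoff.shutdown();
    }
}
```

The point of the pattern is that the caller can be stalled by a slow render thread, which is one reason the timestamps recorded on both threads matter for jank analysis.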

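With the draw flow mapped out, how do we monitor jank?

You may have noticed that in all three stages above, one object keeps recording timestamps, for example:
mFrameInfo.markPerformTraversalsStart();
mFrameInfo.markInputHandlingStart();
mFrameInfo.markAnimationsStart(); and so on.

The FrameInfo object is passed along the whole way, from the Java layer down to the native CanvasContext. Every step marks a timestamp and records it.
Jank monitoring can therefore read the time of each step from the FrameInfo object; if the total exceeds one frame interval, the frame has janked. The overall approach looks like this:
1. At the end of CanvasContext::draw(), where drawing finishes, pass the FrameInfo to a custom service.
2. The custom service processes the timing details and decides whether frames were dropped.
3. The custom service either bridges to the app over AIDL or stores the dropped-frame data itself.
4. Aggregate the processed data and upload it to a server over the network for analysis.
5. CanvasContext also has access to information about the Activity currently being drawn, so dropped frames can be tied to a window.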
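
As an illustration of steps 2, 3 and 5, the custom service could keep a small in-memory tally per Activity. The sketch below is only one possible shape: JankCollector, its method names, and the idea of passing in a precomputed frame duration are all assumptions made for this example, and the transport (AIDL, upload) is left out entirely.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical per-Activity tally that a custom monitoring service might keep.
class JankCollector {
    private final long frameBudgetNanos;
    // Value layout: [0] = total frames seen, [1] = frames over budget.
    private final Map<String, int[]> statsByActivity = new TreeMap<>();

    JankCollector(long frameBudgetNanos) {
        this.frameBudgetNanos = frameBudgetNanos;
    }

    // Called once per drawn frame with the Activity name and the frame's duration.
    void onFrame(String activity, long frameDurationNanos) {
        int[] stats = statsByActivity.computeIfAbsent(activity, key -> new int[2]);
        stats[0]++;
        if (frameDurationNanos > frameBudgetNanos) {
            stats[1]++;
        }
    }

    int totalFrames(String activity) {
        int[] stats = statsByActivity.get(activity);
        return stats == null ? 0 : stats[0];
    }

    int jankyFrames(String activity) {
        int[] stats = statsByActivity.get(activity);
        return stats == null ? 0 : stats[1];
    }

    public static void main(String[] args) {
        JankCollector collector = new JankCollector(16_666_667L); // assumes a 60 Hz display
        collector.onFrame("MainActivity", 10_000_000L);
        collector.onFrame("MainActivity", 40_000_000L);
        System.out.println("MainActivity janky frames: " + collector.jankyFrames("MainActivity")
                + " of " + collector.totalFrames("MainActivity"));
    }
}
```

Keeping only counters per Activity keeps the hot path cheap; anything heavier, such as uploading, can run later on a background thread.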

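FrameInfo fields

What each timestamp means:

  • IntendedVsync: the app_vsync time
  • Vsync: when handling of the vsync event started
  • OldestInputEvent: the time of the oldest InputEvent in the batch handled for this frame
  • NewestInputEvent: the time of the newest InputEvent in that batch
  • HandleInputStart: when the main thread started handling input events
  • AnimationStart: when the main thread started handling animations
  • PerformTraversalsStart: when the main thread started traversing the view tree
  • DrawStart: when the main thread started running draw
  • SyncQueued: when the frame task was queued to the RenderThread
  • SyncStart: when the RenderThread started syncing data from the main thread
  • IssueDrawCommandsStart: when the RenderThread started drawing
  • SwapBuffers: when the RenderThread started swapping buffers
  • FrameCompleted: when the RenderThread finished swapping buffers
  • DequeueBufferDuration: time the RenderThread spent in dequeueBuffer during the swap
  • QueueBufferDuration: time the RenderThread spent in queueBuffer

What the intervals between the key drawing milestones mean (MQS uses these intervals to decide whether the app itself caused the jank):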

  • IntendedVsync to Vsync: main thread delay time
  • HandleInputStart to AnimationStart: handle_input_time
  • AnimationStart to PerformTraversalsStart: handle_animation_time
  • PerformTraversalsStart to DrawStart: handle_traversal_time
  • SyncStart to IssueDrawCommandsStart: bitmap_uploads_time
  • IssueDrawCommandsStart to SwapBuffers: issue_draw_commands_time
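
The intervals listed above are simple differences between pairs of timestamps. The following sketch computes them from an array of marks and picks the slowest stage. FrameStages and its Mark enum are hypothetical; in particular, the enum order is just this example's own indexing and should not be read as the real layout of the FrameInfo array, which is defined in the platform sources.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that turns frame timestamps into the stage intervals listed above.
class FrameStages {
    // The order here is this sketch's own indexing, not the real FrameInfo layout.
    enum Mark {
        INTENDED_VSYNC, VSYNC, HANDLE_INPUT_START, ANIMATION_START,
        PERFORM_TRAVERSALS_START, DRAW_START, SYNC_START,
        ISSUE_DRAW_COMMANDS_START, SWAP_BUFFERS
    }

    // Returns the intervals in the same order as the list in the article.
    static Map<String, Long> intervals(long[] marks) {
        Map<String, Long> out = new LinkedHashMap<>();
        out.put("main_thread_delay", between(marks, Mark.INTENDED_VSYNC, Mark.VSYNC));
        out.put("handle_input_time", between(marks, Mark.HANDLE_INPUT_START, Mark.ANIMATION_START));
        out.put("handle_animation_time", between(marks, Mark.ANIMATION_START, Mark.PERFORM_TRAVERSALS_START));
        out.put("handle_traversal_time", between(marks, Mark.PERFORM_TRAVERSALS_START, Mark.DRAW_START));
        out.put("bitmap_uploads_time", between(marks, Mark.SYNC_START, Mark.ISSUE_DRAW_COMMANDS_START));
        out.put("issue_draw_commands_time", between(marks, Mark.ISSUE_DRAW_COMMANDS_START, Mark.SWAP_BUFFERS));
        return out;
    }

    // Name of the stage with the largest interval (first one wins on a tie).
    static String slowestStage(Map<String, Long> intervals) {
        String slowest = null;
        long longest = Long.MIN_VALUE;
        for (Map.Entry<String, Long> entry : intervals.entrySet()) {
            if (entry.getValue() > longest) {
                longest = entry.getValue();
                slowest = entry.getKey();
            }
        }
        return slowest;
    }

    private static long between(long[] marks, Mark start, Mark end) {
        return marks[end.ordinal()] - marks[start.ordinal()];
    }

    public static void main(String[] args) {
        long[] marks = {0, 1, 2, 5, 9, 15, 16, 18, 25}; // made-up sample values
        Map<String, Long> stages = intervals(marks);
        stages.forEach((name, value) -> System.out.println(name + " = " + value));
        System.out.println("slowest: " + slowestStage(stages));
    }
}
```

Knowing which stage dominates helps separate jank caused by the app's own main-thread work from time lost on the render thread.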

Jank detection uses FrameCompleted - IntendedVsync as the duration of one frame.
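
As a worked example of that rule, the sketch below computes the frame duration and converts it into a count of missed vsync periods. FrameBudget is a hypothetical class, the 60 Hz period is an assumption (the period depends on the display's refresh rate), and counting missed periods as ceil(duration / period) - 1 is one reasonable convention rather than something the source prescribes.

```java
// Hypothetical helper applying the FrameCompleted - IntendedVsync rule.
class FrameBudget {
    // Assumes a 60 Hz display; integer division gives 16_666_666 ns.
    static final long VSYNC_PERIOD_NANOS_60HZ = 1_000_000_000L / 60;

    // Duration of one frame as defined in the article.
    static long frameDurationNanos(long intendedVsyncNanos, long frameCompletedNanos) {
        return frameCompletedNanos - intendedVsyncNanos;
    }

    // Extra vsync periods the frame spilled into; 0 means it finished on time.
    // For durations above one period this equals ceil(duration / period) - 1.
    static long missedVsyncs(long durationNanos, long vsyncPeriodNanos) {
        if (durationNanos <= vsyncPeriodNanos) {
            return 0;
        }
        return (durationNanos - 1) / vsyncPeriodNanos;
    }

    public static void main(String[] args) {
        long duration = frameDurationNanos(1_000_000_000L, 1_040_000_000L);
        System.out.println("duration (ns): " + duration);
        System.out.println("missed vsyncs: " + missedVsyncs(duration, VSYNC_PERIOD_NANOS_60HZ));
    }
}
```

On devices with higher refresh rates the same code applies with a shorter period.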

How to optimize

Android Performance Patterns

Article URL: https://blog.csdn.net/archie_7/article/details/110492894