欢迎您访问程序员文章站本站旨在为大家提供分享程序员计算机编程知识!
您现在的位置是: 首页

Android异常崩溃处理流程-源码分析

程序员文章站 2024-02-13 20:50:40
...

1、前言

在文章开篇,我们抛出两个问题:

  • 当我们的应用发生crash或是anr的时候,系统框架做了什么?
  • 我们是否可以接收系统监控到的应用崩溃,并进行记录和上传呢?

而要解释这个问题,就不得不从进程的启动开始讲起,因为监控也是在这个时候开始的..

2、源码分析

2.1、启动app进程

有深入了解过android底层的同学应该会都知道,Android进程都是由zygote孵化而来。而启动zygote和其他Java程序的应用程序, 代码位于frameworks/base/cmds/app_process/app_main.cpp

关键代码:

int main(int argc, char* const argv[])
{
    ....
    if (strcmp(arg, "--zygote") == 0) {
        zygote = true;
        niceName = ZYGOTE_NICE_NAME;
    } else if (strcmp(arg, "--start-system-server") == 0) {
        startSystemServer = true;
    } else if (strcmp(arg, "--application") == 0) {
        application = true;
    } else if (strncmp(arg, "--nice-name=", 12) == 0) {
        niceName.setTo(arg + 12);
    }
    
    ....
    if (zygote) {
        runtime.start("com.android.internal.os.ZygoteInit", args, zygote);
    } else if (className) {
        runtime.start("com.android.internal.os.RuntimeInit", args, zygote);
    } else {
        fprintf(stderr, "Error: no class name or --zygote supplied.\n");
        app_usage();
        LOG_ALWAYS_FATAL("app_process: no class name or --zygote supplied.");
    }
}

可以看到,app_process 里面定义了三种应用程序类型:

  1. Zygote: com.android.internal.os.ZygoteInit
  2. System Server, 不单独启动,而是由Zygote启动
  3. 其他指定类名的Java 程序,比如说常用的 am. /system/bin/am ,其实是一个shell程序,它的真正实现是:
    exec app_process @"

根据传入参数的不同可以有两种启动方式,一个是 "com.android.internal.os.RuntimeInit", 另一个是 ”com.android.internal.os.ZygoteInit", 对应RuntimeInit 和 ZygoteInit 两个类, 图中用绿色和粉红色分别表示。这两个类的主要区别在于Java端,可以明显看出,ZygoteInit 相比 RuntimeInit 多做了很多事情,比如说 “preload", "gc" 等等。但是在Native端,他们都做了相同的事, startVM() 和 startReg().而我们的目的是在这两类的java端实现中.

2.2、 ZygoteInit

当VM准备就绪,就可以运行Java代码了,系统也将在此第一次进入Java世界,还记得app_main.cpp里面调到的 Runtime.start()的参数吗, 那就是我们要运行的Java类。Android支持两个类做为起点,一个是‘com.android.internal.os.ZygoteInit', 另一个是'com.android.internal.os.RuntimeInit'。

此外Runtime_Init 类里还定义了一个ZygoteInit() 静态方法。它在Zygote 创建一个新的应用进程的时候被创建,它和RuntimeInit 类的main() 函数做了以下相同的事情:

  • redirectLogStreams(): 将System.out 和 System.err 输出重定向到Android 的Log系统(定义在 android.util.Log).

  • commonInit(): 初始化了一下系统属性,其中最重要的一点就是设置了一个未捕捉异常的handler,当代码有任何未知异常,就会执行它

RuntimeInit:

 protected static final void commonInit() {
        if (DEBUG) Slog.d(TAG, "Entered RuntimeInit!");

        /*
         * set handlers; these apply to all threads in the VM. Apps can replace
         * the default handler, but not the pre handler.
         */
        LoggingHandler loggingHandler = new LoggingHandler();
        Thread.setUncaughtExceptionPreHandler(loggingHandler);
        Thread.setDefaultUncaughtExceptionHandler(new KillApplicationHandler(loggingHandler));

        /*
       ...
    }

调试过Android代码的同学经常看到的"*** FATAL EXCEPTION IN SYSTEM PROCESS" 打印就出自LoggingHandler这里:

 private static class LoggingHandler implements Thread.UncaughtExceptionHandler {
        public volatile boolean mTriggered = false;

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            ...
            // Don't re-enter if KillApplicationHandler has already run
            if (mCrashing) return;

            // mApplicationObject is null for non-zygote java programs (e.g. "am")
            // There are also apps running with the system UID. We don't want the
            // first clause in either of these two cases, only for system_server.
            if (mApplicationObject == null && (Process.SYSTEM_UID == Process.myUid())) {
                Clog_e(TAG, "*** FATAL EXCEPTION IN SYSTEM PROCESS: " + t.getName(), e);
            } else {
                StringBuilder message = new StringBuilder();
                // The "FATAL EXCEPTION" string is still used on Android even though
                // apps can set a custom UncaughtExceptionHandler that renders uncaught
                // exceptions non-fatal.
                message.append("FATAL EXCEPTION: ").append(t.getName()).append("\n");
                final String processName = ActivityThread.currentProcessName();
                if (processName != null) {
                    message.append("Process: ").append(processName).append(", ");
                }
                message.append("PID: ").append(Process.myPid());
                Clog_e(TAG, message.toString(), e);
            }
        }
    }

KillApplicationHandler:

 private static class KillApplicationHandler implements Thread.UncaughtExceptionHandler {
        private final LoggingHandler mLoggingHandler;

        public KillApplicationHandler(LoggingHandler loggingHandler) {
            this.mLoggingHandler = Objects.requireNonNull(loggingHandler);
        }

        @Override
        public void uncaughtException(Thread t, Throwable e) {
            try {
                ...

                // Don't re-enter -- avoid infinite loops if crash-reporting crashes.
                if (mCrashing) return;
                mCrashing = true;

                ...
                //关键代码
                // Bring up crash dialog, wait for it to be dismissed
                ActivityManager.getService().handleApplicationCrash(
                        mApplicationObject, new ApplicationErrorReport.ParcelableCrashInfo(e));
            } catch (Throwable t2) {
               ...
            } finally {
                // Try everything to make sure this process goes away.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        }

ActivityManager.getService().handleApplicationCrash-->ActivityManagerService.handleApplicationCrash,此处转到ams进行处理。

从这里的调用函数我们也可以了看出,handleApplicationCrash,是处理应用crash的,但其实还有native_crash、ANR、wtf,下面我们就来说说各种类型的崩溃处理流程.

2.3、Application crash

ActivityManagerService:

  public void handleApplicationCrash(IBinder app,
            ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
        ProcessRecord r = findAppProcess(app, "Crash");
        final String processName = app == null ? "system_server"
                : (r == null ? "unknown" : r.processName);

        handleApplicationCrashInner("crash", r, processName, crashInfo);
    }

从这里可以看出,若传入app为null时,processName就设置为system_server

 void handleApplicationCrashInner(String eventType, ProcessRecord r, String processName,
            ApplicationErrorReport.CrashInfo crashInfo) {
        ...
        //关键
        addErrorToDropBox(eventType, r, processName, null, null, null, null, null, crashInfo);
        if (r == null || r.pid == MY_PID) {
            if ( "crash".equals(eventType) ) {
                dumpErrorInfo(processName, MY_PID, 2, 2);
                SystemClock.sleep(3000);
            }
        } else {
          
        }

        mAppErrors.crashApplication(r, crashInfo);
    }

上面会调用addErrorToDropBox将应用crash,进行封装输出,这里我们先不说addErrorToDropBox的具体逻辑,先说说其他类型的异常崩溃,你会发现,最终都是调用addErrorToDropBox方法。

2.4、native_crash

native_crash,顾名思义,就是native层发生的crash。其实他是通过一个NativeCrashListener线程去监控的。

final class NativeCrashListener extends Thread {
    ...

    /*
     * Daemon thread that accept()s incoming domain socket connections from debuggerd
     * and processes the crash dump that is passed through.
     */
    NativeCrashListener(ActivityManagerService am) {
        mAm = am;
    }

    @Override
    public void run() {
        final byte[] ackSignal = new byte[1];

       ...

        // The file system entity for this socket is created with 0777 perms, owned
        // by system:system. selinux restricts things so that only crash_dump can
        // access it.
        {
            File socketFile = new File(DEBUGGERD_SOCKET_PATH);
            if (socketFile.exists()) {
                socketFile.delete();
            }
        }

        try {
            FileDescriptor serverFd = Os.socket(AF_UNIX, SOCK_STREAM, 0);
            final UnixSocketAddress sockAddr = UnixSocketAddress.createFileSystem(
                    DEBUGGERD_SOCKET_PATH);
            Os.bind(serverFd, sockAddr);
            Os.listen(serverFd, 1);
            Os.chmod(DEBUGGERD_SOCKET_PATH, 0777);
            
            //1.一直循环地读peerFd文件,若发生存在,则进入consumeNativeCrashData
            while (true) {
                FileDescriptor peerFd = null;
                try {
                    if (MORE_DEBUG) Slog.v(TAG, "Waiting for debuggerd connection");
                    peerFd = Os.accept(serverFd, null /* peerAddress */);
                    if (MORE_DEBUG) Slog.v(TAG, "Got debuggerd socket " + peerFd);
                    if (peerFd != null) {
                        // the reporting thread may take responsibility for
                        // acking the debugger; make sure we play along.
                        //2.进入native crash数据处理流程
                        consumeNativeCrashData(peerFd);
                    }
                } catch (Exception e) {
                    Slog.w(TAG, "Error handling connection", e);
                } finally {
                    ...
                }
            }
        } catch (Exception e) {
            Slog.e(TAG, "Unable to init native debug socket!", e);
        }
    }

  

    // Read a crash report from the connection
    void consumeNativeCrashData(FileDescriptor fd) {
        try {
                ...
                //3.启动NativeCrashReporter作为上报错误的新线程
                final String reportString = new String(os.toByteArray(), "UTF-8");
                (new NativeCrashReporter(pr, signal, reportString)).start();
             
        } catch (Exception e) {
            ...
        }
    }

}

上报native_crash的线程-->NativeCrashReporter:

 class NativeCrashReporter extends Thread {
        ProcessRecord mApp;
        int mSignal;
        String mCrashReport;

        NativeCrashReporter(ProcessRecord app, int signal, String report) {
            super("NativeCrashReport");
            mApp = app;
            mSignal = signal;
            mCrashReport = report;
        }

        @Override
        public void run() {
            try {
                //1.包装崩溃信息
                CrashInfo ci = new CrashInfo();
                ci.exceptionClassName = "Native crash";
                ci.exceptionMessage = Os.strsignal(mSignal);
                ci.throwFileName = "unknown";
                ci.throwClassName = "unknown";
                ci.throwMethodName = "unknown";
                ci.stackTrace = mCrashReport;

                if (DEBUG) Slog.v(TAG, "Calling handleApplicationCrash()");
                //2.转到ams中处理,跟普通crash一致,只是类型不一样
                mAm.handleApplicationCrashInner("native_crash", mApp, mApp.processName, ci);
                if (DEBUG) Slog.v(TAG, "<-- handleApplicationCrash() returned");
            } catch (Exception e) {
                Slog.e(TAG, "Unable to report native crash", e);
            }
        }
    }

native crash跟到这里就结束了,后面的流程就是跟application crash一样,都会走到addErrorToDropBox中,这个我们后面再分析。

2.5、ANR

相信做android的同学都知道anr这个让人郁闷的东西,而发生的场景,也有好几种,比如Activity、Service、Broadcast。因为篇幅有限,我们这里就不讨论每种anr发生后的原因和具体的流程了,直接跳到已经触发ANR的位置。

AppErrors.appNotResponding:

final void appNotResponding(ProcessRecord app, ActivityRecord activity,
            ActivityRecord parent, boolean aboveSystem, final String annotation) {
        ArrayList<Integer> firstPids = new ArrayList<Integer>(5);
        SparseArray<Boolean> lastPids = new SparseArray<Boolean>(20);

        if (mService.mController != null) {
            try {
                //1.判断是否继续后面的流程,还是直接kill掉当前进程 
                // 0 == continue, -1 = kill process immediately
                int res = mService.mController.appEarlyNotResponding(
                        app.processName, app.pid, annotation);
                if (res < 0 && app.pid != MY_PID) {
                    app.kill("anr", true);
                }
            } catch (RemoteException e) {
                mService.mController = null;
                Watchdog.getInstance().setActivityController(null);
            }
        }

        //2.记录发生anr的时间
        long anrTime = SystemClock.uptimeMillis();
        //3.更新cpu使用情况
        if (ActivityManagerService.MONITOR_CPU_USAGE) {
            mService.updateCpuStatsNow();
        }

        //可以在设置中设置发生anr后,是弹框显示还是后台处理,默认是后台
        // Unless configured otherwise, swallow ANRs in background processes & kill the process.
        boolean showBackground = Settings.Secure.getInt(mContext.getContentResolver(),
                Settings.Secure.ANR_SHOW_BACKGROUND, 0) != 0;

        boolean isSilentANR;

        synchronized (mService) {
            ...

            // In case we come through here for the same app before completing
            // this one, mark as anring now so we will bail out.
            app.notResponding = true;

            //3.将anr写入event log中
            EventLog.writeEvent(EventLogTags.AM_ANR, app.userId, app.pid,
                    app.processName, app.info.flags, annotation);

            // Dump thread traces as quickly as we can, starting with "interesting" processes.
            firstPids.add(app.pid);

            // Don't dump other PIDs if it's a background ANR
            isSilentANR = !showBackground && !isInterestingForBackgroundTraces(app);
            if (!isSilentANR) {
                int parentPid = app.pid;
                if (parent != null && parent.app != null && parent.app.pid > 0) {
                    parentPid = parent.app.pid;
                }
                if (parentPid != app.pid) firstPids.add(parentPid);

                if (MY_PID != app.pid && MY_PID != parentPid) firstPids.add(MY_PID);

                for (int i = mService.mLruProcesses.size() - 1; i >= 0; i--) {
                    ProcessRecord r = mService.mLruProcesses.get(i);
                    if (r != null && r.thread != null) {
                        int pid = r.pid;
                        if (pid > 0 && pid != app.pid && pid != parentPid && pid != MY_PID) {
                            if (r.persistent) {
                                firstPids.add(pid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding persistent proc: " + r);
                            } else if (r.treatLikeActivity) {
                                firstPids.add(pid);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding likely IME: " + r);
                            } else {
                                lastPids.put(pid, Boolean.TRUE);
                                if (DEBUG_ANR) Slog.i(TAG, "Adding ANR proc: " + r);
                            }
                        }
                    }
                }
            }
            
        }

        // 4.将主要的anr信息写到main.log中
        StringBuilder info = new StringBuilder();
        info.setLength(0);
        info.append("ANR in ").append(app.processName);
        if (activity != null && activity.shortComponentName != null) {
            info.append(" (").append(activity.shortComponentName).append(")");
        }
        info.append("\n");
        info.append("PID: ").append(app.pid).append("\n");
        if (annotation != null) {
            info.append("Reason: ").append(annotation).append("\n");
        }
        if (parent != null && parent != activity) {
            info.append("Parent: ").append(parent.shortComponentName).append("\n");
        }
        
        ProcessCpuTracker processCpuTracker = new ProcessCpuTracker(true);
        ArrayList<Integer> nativePids = null;

        // don't dump native PIDs for background ANRs unless it is the process of interest
        String[] nativeProc = null;
        if (isSilentANR) {
            for (int i = 0; i < NATIVE_STACKS_OF_INTEREST.length; i++) {
                if (NATIVE_STACKS_OF_INTEREST[i].equals(app.processName)) {
                    nativeProc = new String[] { app.processName };
                    break;
                }
            }
            int[] pid = nativeProc == null ? null : Process.getPidsForCommands(nativeProc);
            if(pid != null){
                nativePids = new ArrayList<Integer>(pid.length);
                for (int i : pid) {
                    nativePids.add(i);
                }
            }
        } else {
            nativePids = Watchdog.getInstance().getInterestingNativePids();
        }

        //5.dump出stacktraces文件
        // For background ANRs, don't pass the ProcessCpuTracker to
        // avoid spending 1/2 second collecting stats to rank lastPids.
        File tracesFile = ActivityManagerService.dumpStackTraces(
                true, firstPids,
                (isSilentANR) ? null : processCpuTracker,
                (isSilentANR) ? null : lastPids,
                nativePids);

        String cpuInfo = null;
        if (ActivityManagerService.MONITOR_CPU_USAGE) {
            //6.再次更新cpu使用情况
            mService.updateCpuStatsNow();
            synchronized (mService.mProcessCpuTracker) {
                //7.打印anr时cpu使用状态
                cpuInfo = mService.mProcessCpuTracker.printCurrentState(anrTime);
            }
            info.append(processCpuTracker.printCurrentLoad());
            info.append(cpuInfo);
        }

        info.append(processCpuTracker.printCurrentState(anrTime));

        //8.当traces文件不存在时,只能打印线程日志了
        if (tracesFile == null) {
            // There is no trace file, so dump (only) the alleged culprit's threads to the log
            Process.sendSignal(app.pid, Process.SIGNAL_QUIT);
        }

        ...
        //9.关键,回到了我们熟悉的addErrorToDropBox,进行错误信息包装跟上传了
        mService.addErrorToDropBox("anr", app, app.processName, activity, parent, annotation,
                cpuInfo, tracesFile, null);

        if (mService.mController != null) {
            try {
                //10.根据appNotResponding返回结果,看是否继续等待,还是结束当前进程
                // 0 == show dialog, 1 = keep waiting, -1 = kill process immediately
                int res = mService.mController.appNotResponding(
                        app.processName, app.pid, info.toString());
                if (res != 0) {
                    if (res < 0 && app.pid != MY_PID) {
                        app.kill("anr", true);
                    } else {
                        synchronized (mService) {
                            mService.mServices.scheduleServiceTimeoutLocked(app);
                        }
                    }
                    return;
                }
            } catch (RemoteException e) {
                mService.mController = null;
                Watchdog.getInstance().setActivityController(null);
            }
        }

       ...
    }

我们来看一下traces文件是怎么dump出来的:

    public static File dumpStackTraces(boolean clearTraces, ArrayList<Integer> firstPids,
            ProcessCpuTracker processCpuTracker, SparseArray<Boolean> lastPids,
            ArrayList<Integer> nativePids) {
        ArrayList<Integer> extraPids = null;

        //1.测量CPU的使用情况,以便在请求时对*用户进行实际的采样。
        if (processCpuTracker != null) {
            processCpuTracker.init();
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
            }

            processCpuTracker.update();

            // 2.爬取*应用到的cpu使用情况
            final int N = processCpuTracker.countWorkingStats();
            extraPids = new ArrayList<>();
            for (int i = 0; i < N && extraPids.size() < 5; i++) {
                ProcessCpuTracker.Stats stats = processCpuTracker.getWorkingStats(i);
                if (lastPids.indexOfKey(stats.pid) >= 0) {
                    if (DEBUG_ANR) Slog.d(TAG, "Collecting stacks for extra pid " + stats.pid);

                    extraPids.add(stats.pid);
                } else if (DEBUG_ANR) {
                    Slog.d(TAG, "Skipping next CPU consuming process, not a java proc: "
                            + stats.pid);
                }
            }
        }
        
        //3.读取trace文件的保存目录
        File tracesFile;
        final String tracesDirProp = SystemProperties.get("dalvik.vm.stack-trace-dir", "");
        if (tracesDirProp.isEmpty()) {
            ...
            String globalTracesPath = SystemProperties.get("dalvik.vm.stack-trace-file", null);
            ...
        } else {
           ...
        }

        //4.传入指定目录,进入实际dump逻辑
        dumpStackTraces(tracesFile.getAbsolutePath(), firstPids, nativePids, extraPids,
                useTombstonedForJavaTraces);
        return tracesFile;
    }

dumpStackTraces:

  private static void dumpStackTraces(String tracesFile, ArrayList<Integer> firstPids,
            ArrayList<Integer> nativePids, ArrayList<Integer> extraPids,
            boolean useTombstonedForJavaTraces) {

        ...
        final DumpStackFileObserver observer;
        if (useTombstonedForJavaTraces) {
            observer = null;
        } else {
            // Use a FileObserver to detect when traces finish writing.
            // The order of traces is considered important to maintain for legibility.
            observer = new DumpStackFileObserver(tracesFile);
        }

        //我们必须在20秒内完成所有堆栈转储。
        long remainingTime = 20 * 1000;
        try {
            if (observer != null) {
                observer.startWatching();
            }

            // 首先收集所有最重要的pid堆栈。
            if (firstPids != null) {
                int num = firstPids.size();
                for (int i = 0; i < num; i++) {
                    if (DEBUG_ANR) Slog.d(TAG, "Collecting stacks for pid "
                            + firstPids.get(i));
                    final long timeTaken;
                    if (useTombstonedForJavaTraces) {
                        timeTaken = dumpJavaTracesTombstoned(firstPids.get(i), tracesFile, remainingTime);
                    } else {
                        timeTaken = observer.dumpWithTimeout(firstPids.get(i), remainingTime);
                    }

                    remainingTime -= timeTaken;
                    if (remainingTime <= 0) {
                        Slog.e(TAG, "Aborting stack trace dump (current firstPid=" + firstPids.get(i) +
                            "); deadline exceeded.");
                        return;
                    }

                    if (DEBUG_ANR) {
                        Slog.d(TAG, "Done with pid " + firstPids.get(i) + " in " + timeTaken + "ms");
                    }
                }
            }

            //接下来收集native pid的堆栈
            if (nativePids != null) {
                for (int pid : nativePids) {
                    if (DEBUG_ANR) Slog.d(TAG, "Collecting stacks for native pid " + pid);
                    final long nativeDumpTimeoutMs = Math.min(NATIVE_DUMP_TIMEOUT_MS, remainingTime);

                    final long start = SystemClock.elapsedRealtime();
                    Debug.dumpNativeBacktraceToFileTimeout(
                            pid, tracesFile, (int) (nativeDumpTimeoutMs / 1000));
                    final long timeTaken = SystemClock.elapsedRealtime() - start;

                    remainingTime -= timeTaken;
                    if (remainingTime <= 0) {
                        Slog.e(TAG, "Aborting stack trace dump (current native pid=" + pid +
                            "); deadline exceeded.");
                        return;
                    }

                    if (DEBUG_ANR) {
                        Slog.d(TAG, "Done with native pid " + pid + " in " + timeTaken + "ms");
                    }
                }
            }

            // 最后,从CPU跟踪器转储所有额外PID的堆栈。
            if (extraPids != null) {
                for (int pid : extraPids) {
                    if (DEBUG_ANR) Slog.d(TAG, "Collecting stacks for extra pid " + pid);

                    final long timeTaken;
                    if (useTombstonedForJavaTraces) {
                        timeTaken = dumpJavaTracesTombstoned(pid, tracesFile, remainingTime);
                    } else {
                        timeTaken = observer.dumpWithTimeout(pid, remainingTime);
                    }

                    remainingTime -= timeTaken;
                    if (remainingTime <= 0) {
                        Slog.e(TAG, "Aborting stack trace dump (current extra pid=" + pid +
                                "); deadline exceeded.");
                        return;
                    }

                    if (DEBUG_ANR) {
                        Slog.d(TAG, "Done with extra pid " + pid + " in " + timeTaken + "ms");
                    }
                }
            }
        } finally {
            if (observer != null) {
                observer.stopWatching();
            }
        }
    }

看完之后,应该可以很清楚地的明白。ANR的流程就是打印一些 ANR reason、cpu stats、线程日志,然后分别写入main.log、event.log,然后调用到addErrorToDropBox中,最后kill该进程。

2.6、wtf

同样在RuntimeInit中,存在一个wtf的方法,该方法主要由Log.wtf调用,主要用报告一个永远不可能发生的情况。毕竟wtf=what a terrible failure,都开始爆粗了,这个问题可不得了。

 public static void wtf(String tag, Throwable t, boolean system) {
        try {
            if (ActivityManager.getService().handleApplicationWtf(
                    mApplicationObject, tag, system,
                    new ApplicationErrorReport.ParcelableCrashInfo(t))) {
                // The Activity Manager has already written us off -- now exit.
                Process.killProcess(Process.myPid());
                System.exit(10);
            }
        } catch (Throwable t2) {
            if (t2 instanceof DeadObjectException) {
                // System process is dead; ignore
            } else {
                Slog.e(TAG, "Error reporting WTF", t2);
                Slog.e(TAG, "Original WTF:", t);
            }
        }
    }

此处又转到AMS中处理:

 public boolean handleApplicationWtf(final IBinder app, final String tag, boolean system,
            final ApplicationErrorReport.ParcelableCrashInfo crashInfo) {
        ...

        if (system) {
            // If this is coming from the system, we could very well have low-level
            // system locks held, so we want to do this all asynchronously.  And we
            // never want this to become fatal, so there is that too.
            mHandler.post(new Runnable() {
                @Override public void run() {
                    handleApplicationWtfInner(callingUid, callingPid, app, tag, crashInfo);
                }
            });
            return false;
        }
        //无论system是true或是false,都会调用handleApplicationWtfInner
        final ProcessRecord r = handleApplicationWtfInner(callingUid, callingPid, app, tag,
                crashInfo);

        final boolean isFatal = Build.IS_ENG || Settings.Global
                .getInt(mContext.getContentResolver(), Settings.Global.WTF_IS_FATAL, 0) != 0;
        final boolean isSystem = (r == null) || r.persistent;

        if (isFatal && !isSystem) {
            //将再进入普通应用crash的流程
            mAppErrors.crashApplication(r, crashInfo);
            return true;
        } else {
            return false;
        }
    }

handleApplicationWtfInner:

   ProcessRecord handleApplicationWtfInner(int callingUid, int callingPid, IBinder app, String tag,
            final ApplicationErrorReport.CrashInfo crashInfo) {
        final ProcessRecord r = findAppProcess(app, "WTF");
        final String processName = app == null ? "system_server"
                : (r == null ? "unknown" : r.processName);

        EventLog.writeEvent(EventLogTags.AM_WTF, UserHandle.getUserId(callingUid), callingPid,
                processName, r == null ? -1 : r.info.flags, tag, crashInfo.exceptionMessage);

        StatsLog.write(StatsLog.WTF_OCCURRED, callingUid, tag, processName,
                callingPid);

        //关键,最终走到了我们的addErrorToDropBox
        addErrorToDropBox("wtf", r, processName, null, null, tag, null, null, crashInfo);

        return r;
    }

2.7、殊途同归的addErrorToDropBox

为什么说addErrorToDropBox是殊途同归呢,因为无论是crash、native_crash、ANR或是wtf,最终都是来到这里,交由它去处理。那下面我们就来揭开它的神秘面纱吧。

    public void addErrorToDropBox(String eventType,
            ProcessRecord process, String processName, ActivityRecord activity,
            ActivityRecord parent, String subject,
            final String report, final File dataFile,
            final ApplicationErrorReport.CrashInfo crashInfo) {
        // NOTE -- this must never acquire the ActivityManagerService lock,
        // otherwise the watchdog may be prevented from resetting the system.

        // Bail early if not published yet
        if (ServiceManager.getService(Context.DROPBOX_SERVICE) == null) return;
        final DropBoxManager dbox = mContext.getSystemService(DropBoxManager.class);

        //只有这几种类型的错误,才会进行上传
        final boolean shouldReport = ("anr".equals(eventType)
                || "crash".equals(eventType)
                || "native_crash".equals(eventType)
                || "watchdog".equals(eventType));

        // Exit early if the dropbox isn't configured to accept this report type.
        final String dropboxTag = processClass(process) + "_" + eventType;
        //1.如果DropBoxManager没有初始化,或不是要上传的类型,则直接返回
        if (dbox == null || !dbox.isTagEnabled(dropboxTag)&& !shouldReport) 
            return;

        ...

        final StringBuilder sb = new StringBuilder(1024);
        //2.添加一些头部log信息
        appendDropBoxProcessHeaders(process, processName, sb);
        //3.添加崩溃进程和界面的信息
        try {
            if (process != null) {
                //添加是否前台前程log
                sb.append("Foreground: ")
                        .append(process.isInterestingToUserLocked() ? "Yes" : "No")
                        .append("\n");
            }
            //触发该崩溃的界面,可以为null
            if (activity != null) {
                sb.append("Activity: ").append(activity.shortComponentName).append("\n");
            }
            if (parent != null && parent.app != null && parent.app.pid != process.pid) {
                sb.append("Parent-Process: ").append(parent.app.processName).append("\n");
            }
            if (parent != null && parent != activity) {
                sb.append("Parent-Activity: ").append(parent.shortComponentName).append("\n");
            }
            //定入简要信息
            if (subject != null) {
                sb.append("Subject: ").append(subject).append("\n");
            }
            sb.append("Build: ").append(Build.FINGERPRINT).append("\n");
            //是否连接了调试
            if (Debug.isDebuggerConnected()) {
                sb.append("Debugger: Connected\n");
            }
        } catch (NullPointerException e) {
           e.printStackTrace();
        } finally {
            sb.append("\n");
        }


        final String fProcessName = processName;
        final String fEventType = eventType;
        final String packageName = getErrorReportPackageName(process, crashInfo, eventType);
        Slog.i(TAG,"addErrorToDropbox, real report package is "+packageName);

        // Do the rest in a worker thread to avoid blocking the caller on I/O
        // (After this point, we shouldn't access AMS internal data structures.)
        Thread worker = new Thread("Error dump: " + dropboxTag) {
            @Override
            public void run() {
                //4.添加进程的状态到dropbox中
                BufferedReader bufferedReader = null;
                String line;
                try {
                    bufferedReader = new BufferedReader(new FileReader("/proc/" + pid + "/status"));
                    for (int i = 0; i < 5; i++) {
                        if ((line = bufferedReader.readLine()) != null && line.contains("State")) {
                            sb.append(line + "\n");
                            break;
                        }
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                } finally {
                    if (bufferedReader != null) {
                        try {
                            bufferedReader.close();
                        } catch (IOException e) {
                            e.printStackTrace();
                        }
                    }
                }
               
                if (report != null) {
                    sb.append(report);
                }

                String setting = Settings.Global.ERROR_LOGCAT_PREFIX + dropboxTag;
                int lines = Settings.Global.getInt(mContext.getContentResolver(), setting, 0);
                int maxDataFileSize = DROPBOX_MAX_SIZE - sb.length()
                        - lines * RESERVED_BYTES_PER_LOGCAT_LINE;

                //5.将dataFile文件定入dropbox中,一般只有anr时,会将traces文件通过该参数传递进来者,其他类型都不传.
                if (dataFile != null && maxDataFileSize > 0) {
                    try {
                        sb.append(FileUtils.readTextFile(dataFile, maxDataFileSize,
                                    "\n\n[[TRUNCATED]]"));
                    } catch (IOException e) {
                        Slog.e(TAG, "Error reading " + dataFile, e);
                    }
                }
                
                //6.如果是crash类型,会传入crashInfo,此时将其写入dropbox中
                if (crashInfo != null && crashInfo.stackTrace != null) {
                    sb.append(crashInfo.stackTrace);
                }

                if (lines > 0) {
                    sb.append("\n");

                    // 7.合并几个logcat流,取最新部分log
                    InputStreamReader input = null;
                    try {
                        java.lang.Process logcat = new ProcessBuilder(
                                "/system/bin/timeout", "-k", "15s", "10s",
                                "/system/bin/logcat", "-v", "threadtime", "-b", "events", "-b", "system",
                                "-b", "main", "-b", "crash", "-t", String.valueOf(lines))
                                        .redirectErrorStream(true).start();

                        try { logcat.getOutputStream().close(); } catch (IOException e) {}
                        try { logcat.getErrorStream().close(); } catch (IOException e) {}
                        input = new InputStreamReader(logcat.getInputStream());

                        int num;
                        char[] buf = new char[8192];
                        while ((num = input.read(buf)) > 0) sb.append(buf, 0, num);
                    } catch (IOException e) {
                        Slog.e(TAG, "Error running logcat", e);
                    } finally {
                        if (input != null) try { input.close(); } catch (IOException e) {}
                    }
                }

                ...

                
                if (shouldReport) {
                    synchronized (mErrorListenerLock) {
                        try {
                            if (mIApplicationErrorListener == null) {
                                return;
                            }
                            //8.关键,在这里可以添加一个application error的接口,用来实现应用层接收崩溃信息
                            mIApplicationErrorListener.onError(fEventType,
                                    packageName, fProcessName, subject, dropboxTag
                                            + "-" + uuid, crashInfo);
                        } catch (DeadObjectException e) {
                            Slog.i(TAG, "ApplicationErrorListener.onError() E :" + e, e);
                            mIApplicationErrorListener = null;
                        } catch (Exception e) {
                            Slog.i(TAG, "ApplicationErrorListener.onError() E :" + e, e);
                        }
                    }
                }
            }
        };

        ...
    }

调用appendDropBoxProcessHeaders添加头部log信息:

 private void appendDropBoxProcessHeaders(ProcessRecord process, String processName,
            StringBuilder sb) {
        // Watchdog thread ends up invoking this function (with
        // a null ProcessRecord) to add the stack file to dropbox.
        // Do not acquire a lock on this (am) in such cases, as it
        // could cause a potential deadlock, if and when watchdog
        // is invoked due to unavailability of lock on am and it
        // would prevent watchdog from killing system_server.
        if (process == null) {
            sb.append("Process: ").append(processName).append("\n");
            return;
        }
        // Note: ProcessRecord 'process' is guarded by the service
        // instance.  (notably process.pkgList, which could otherwise change
        // concurrently during execution of this method)
        synchronized (this) {
            sb.append("Process: ").append(processName).append("\n");
            sb.append("PID: ").append(process.pid).append("\n");
            int flags = process.info.flags;
            IPackageManager pm = AppGlobals.getPackageManager();
            //添加该进程的flag
            sb.append("Flags: 0x").append(Integer.toHexString(flags)).append("\n");
            for (int ip=0; ip<process.pkgList.size(); ip++) {
                String pkg = process.pkgList.keyAt(ip);
                sb.append("Package: ").append(pkg);
                try {
                    PackageInfo pi = pm.getPackageInfo(pkg, 0, UserHandle.getCallingUserId());
                    if (pi != null) {
                        sb.append(" v").append(pi.getLongVersionCode());
                        if (pi.versionName != null) {
                            sb.append(" (").append(pi.versionName).append(")");
                        }
                    }
                } catch (RemoteException e) {
                    Slog.e(TAG, "Error getting package info: " + pkg, e);
                }
                sb.append("\n");
            }
            //如果是执行安装的app,会在log中添加此项
            if (process.info.isInstantApp()) {
                sb.append("Instant-App: true\n");
            }
        }
    }

2.8、接收来自框架层的异常监控信息

这里,就可以回答我们开篇讲的第二个问题,如何去接收框架层监控到的应用或系统崩溃事件?

可以在框架提供了一个IApplicationErrorListener的接口,可通过设置该接口,接收系统框架捕获到的应用到崩溃信息。
IApplicationErrorListener:

public interface IApplicationErrorListener extends android.os.IInterface {
    /** Local-side IPC implementation stub class. */
    public static abstract class Stub extends android.os.Binder implements android.app.IApplicationErrorListener {
        private static final java.lang.String DESCRIPTOR = "android.app.IApplicationErrorListener";

        /** Construct the stub at attach it to the interface. */
        public Stub() {
            this.attachInterface(this, DESCRIPTOR);
        }

        /**
         * Cast an IBinder object into an android.app.IApplicationErrorListener
         * interface, generating a proxy if needed.
         */
        public static android.app.IApplicationErrorListener asInterface(android.os.IBinder obj) {
            if ((obj == null)) {
                return null;
            }
            android.os.IInterface iin = obj.queryLocalInterface(DESCRIPTOR);
            if (((iin != null) && (iin instanceof android.app.IApplicationErrorListener))) {
                return ((android.app.IApplicationErrorListener) iin);
            }
            return new android.app.IApplicationErrorListener.Stub.Proxy(obj);
        }

        @Override
        public android.os.IBinder asBinder() {
            return this;
        }

        @Override
        public boolean onTransact(int code, android.os.Parcel data, android.os.Parcel reply, int flags)
                throws android.os.RemoteException {
            switch (code) {
                case INTERFACE_TRANSACTION: {
                    reply.writeString(DESCRIPTOR);
                    return true;
                }
                case TRANSACTION_onError: {
                    data.enforceInterface(DESCRIPTOR);
                    String errorType = data.readString();
                    String packageName = data.readString();
                    String processName = data.readString();
                    String subject = data.readString();
                    String dump = data.readString();
                    ApplicationErrorReport.CrashInfo crashInfo = new ApplicationErrorReport.CrashInfo(data);
                    this.onError(errorType, packageName, processName, subject, dump, crashInfo);
                    reply.writeNoException();
                    return true;
                }
            }
            return super.onTransact(code, data, reply, flags);
        }

        private static class Proxy implements android.app.IApplicationErrorListener {
            private android.os.IBinder mRemote;

            Proxy(android.os.IBinder remote) {
                mRemote = remote;
            }

            @Override
            public android.os.IBinder asBinder() {
                return mRemote;
            }

            public java.lang.String getInterfaceDescriptor() {
                return DESCRIPTOR;
            }

            @Override
            public void onError(String errorType, String packageName, String processName, String subject,
                    String dump, CrashInfo crashInfo) throws android.os.RemoteException {
                android.os.Parcel _data = android.os.Parcel.obtain();
                android.os.Parcel _reply = android.os.Parcel.obtain();
                try {
                    _data.writeInterfaceToken(DESCRIPTOR);
                    _data.writeString(errorType);
                    _data.writeString(packageName);
                    _data.writeString(processName);
                    _data.writeString(subject);
                    _data.writeString(dump);
                    if (null == crashInfo) {
                        crashInfo = new CrashInfo();
                    }
                    crashInfo.writeToParcel(_data, 0);
                    mRemote.transact(Stub.TRANSACTION_onError, _data, _reply, 0);
                    _reply.readException();
                } finally {
                    _reply.recycle();
                    _data.recycle();
                }
            }
        }

        static final int TRANSACTION_onError = (android.os.IBinder.FIRST_CALL_TRANSACTION + 0);
    }

    public void onError(String errorType, String packageName, String processName, String subject,
            String dump, ApplicationErrorReport.CrashInfo crashInfo) throws android.os.RemoteException;
}

该接口的关键方法就是:

  public void onError(String errorType, String packageName, String processName, String subject,
            String dump, ApplicationErrorReport.CrashInfo crashInfo) throws android.os.RemoteException;

最终的错误信息,都会通过该方法进行传递。那只要我们的应用拥有系统权限,就可以往系统添加该回调了。