
[Apollo] Using the supervisor component

程序员文章站 2022-07-12 11:47:28

Supervisor

Supervisor is a client/server system for controlling a set of processes on UNIX-like operating systems.

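supervisord (server): responds to commands from the client, starts and stops processes, monitors them, restarts processes that crash or exit, records process logs, and generates and handles events that correspond to points in a process's lifecycle. supervisord reads its settings from a configuration file, usually /etc/supervisord.conf.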

supervisorctl (client): a shell-like command-line interface that sends the user's control commands to the server.

The server and client communicate over a socket (a UNIX domain socket or a TCP socket, depending on the configuration).
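To make the client/server split more concrete, here is a minimal Python sketch of a client that queries supervisord through its XML-RPC interface, which is what supervisorctl uses under the hood. It assumes an `[inet_http_server]` section is enabled and listening on 127.0.0.1:9001; that address is an assumption for illustration and is not confirmed by the Apollo configuration discussed in this post. The helper names `summarize` and `fetch_states` are hypothetical.

```python
import xmlrpc.client


def summarize(process_infos):
    """Map each process name to its state name, e.g. {'monitor': 'RUNNING'}."""
    return {info["name"]: info["statename"] for info in process_infos}


def fetch_states(url="http://127.0.0.1:9001/RPC2"):
    """Ask supervisord for the state of every managed process.

    The URL is an assumption: it only works if [inet_http_server]
    is enabled in the supervisord configuration.
    """
    proxy = xmlrpc.client.ServerProxy(url)
    # supervisor.getAllProcessInfo() returns one dict per process.
    return summarize(proxy.supervisor.getAllProcessInfo())
```

Calling `fetch_states()` against a running supervisord would return a dictionary of process names and states. If only a UNIX socket is configured, a different transport is needed and this sketch does not cover it.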

Using the supervisor component in Apollo

The supervisor component is installed by docker/build/installers/install_supervisor.sh:

# Fail on first error.
set -e

apt-get install -y supervisor
# Add supervisord config file
echo_supervisord_conf > /etc/supervisord.conf

Server side

Listing the processes shows that supervisord is already running, with /apollo/modules/tools/supervisord/dev.conf as its configuration file:

ubuntu@in_dev_docker:/apollo$ ps aux | grep supervisor
root       193  0.2  0.0  49900 14312 ?        Ss   10:29   0:02 /usr/bin/python /usr/local/bin/supervisord -c /apollo/modules/tools/supervisord/dev.conf

Startup: supervisord is launched in scripts/bootstrap.sh

# Setup supervisord.
    if [ "$HOSTNAME" == "in_release_docker" ]; then
        supervisord -c /apollo/modules/tools/supervisord/release.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with release conf"
    else
        supervisord -c /apollo/modules/tools/supervisord/dev.conf >& /tmp/supervisord.start.log
        echo "Started supervisord with dev conf"
    fi

Looking at dev.conf, with routing as an example:

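[program:routing]
command=/apollo/bazel-bin/modules/routing/routing --flagfile=/apollo/modules/routing/conf/routing.conf ; command that starts routing
autostart=false ; do not start automatically when supervisord starts
numprocs=1  ; number of instances: supervisord starts numprocs routing processes
exitcodes=0 ; "expected" exit codes, used by autorestart
stopsignal=INT ; signal sent to the process when a stop is requested
startretries=10 ; maximum number of start attempts
autorestart=unexpected ; restart automatically when the process exits with an unexpected exit code
redirect_stderr=true ; stderr is redirected to stdout, so it ends up in stdout_logfile
stdout_logfile=/apollo/data/log/routing.out ; log file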
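The interaction between `autorestart` and `exitcodes` is easier to see in code. The following Python sketch parses a trimmed copy of the section above and applies a simplified model of the restart decision. The function names `load_program` and `should_restart` are hypothetical, and the model is deliberately reduced: it ignores details such as `startsecs` and the fact that `startretries` governs failures during startup rather than restarts of a running process.

```python
import configparser

# A trimmed copy of the [program:routing] section from dev.conf.
CONF = """
[program:routing]
autostart=false ; not started with supervisord
exitcodes=0
startretries=10
autorestart=unexpected
"""


def load_program(conf_text, name):
    """Return the options of one [program:x] section as a dict."""
    # Supervisor-style inline comments start with " ;".
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(conf_text)
    return dict(parser[f"program:{name}"])


def should_restart(options, exit_code):
    """Simplified model of the autorestart decision for an exited process."""
    policy = options.get("autorestart", "unexpected")
    if policy == "true":
        return True   # always restart
    if policy == "false":
        return False  # never restart
    # "unexpected": restart only if the exit code is not listed in exitcodes.
    expected = {int(code) for code in options.get("exitcodes", "0").split(",")}
    return exit_code not in expected


routing = load_program(CONF, "routing")
print(should_restart(routing, 0))  # False: exit code 0 is expected
print(should_restart(routing, 1))  # True: exit code 1 is unexpected
```

With this configuration, a routing process that exits cleanly stays stopped, while one that crashes with a non-zero exit code is brought back up.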

Client side

The Dreamview startup script scripts/bootstrap.sh shows that both the dreamview and monitor processes are started through supervisorctl.

# Start monitor.
supervisorctl start monitor > /dev/null
# Start dreamview.
bash scripts/voice_detector.sh start
supervisorctl start dreamview > /dev/null
echo "Dreamview is running at http://localhost:8888"

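Dreamview controls the processes of the individual modules through supervisorctl. The commands are defined in modules/dreamview/conf/hmi.conf.
Taking localization as an example, switching the localization module on or off in the HMI runs the following commands.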

modules {
  key: "localization"
  value: {
    display_name: "Localization"
    supported_commands {
      key: "start"
      value: "supervisorctl start localization &"
    }
    supported_commands {
      key: "stop"
      value: "supervisorctl stop localization &"
    }
  }
}
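As a rough illustration of how such a command table can be consumed, the Python sketch below mirrors the hmi.conf entry shown above in a dictionary and turns a module switch into an argument list. How Dreamview actually executes these commands is not shown in this post, so `MODULES`, `resolve_command`, and `to_argv` are hypothetical names used only for illustration.

```python
import shlex

# Hypothetical in-memory mirror of the hmi.conf entry for localization.
MODULES = {
    "localization": {
        "start": "supervisorctl start localization &",
        "stop": "supervisorctl stop localization &",
    },
}


def resolve_command(module, action):
    """Look up the shell command for a module switch (KeyError if unknown)."""
    return MODULES[module][action]


def to_argv(command):
    """Split a command into argv, dropping a trailing '&' background marker."""
    argv = shlex.split(command)
    if argv and argv[-1] == "&":
        argv.pop()
    return argv


print(to_argv(resolve_command("localization", "start")))
# ['supervisorctl', 'start', 'localization']
```

The resulting list could be handed to `subprocess.Popen`, which returns without waiting for the child and so plays the role of the trailing `&` in the original command string.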

Usage

The commands available in the supervisor shell are:

supervisor> help

default commands (type help <topic>):
=====================================
add    exit      open  reload  restart   start   tail   
avail  fg        pid   remove  shutdown  status  update 
clear  maintail  quit  reread  signal    stop    version
  • Check process status
ubuntu@in_dev_docker:/apollo$ sudo supervisorctl status
canbus                           STOPPED   Not started
conti_radar                      STOPPED   Not started
control                          STOPPED   Not started
dreamview                        RUNNING   pid 266, uptime 1:08:45
gps                              STOPPED   Not started
localization                     STOPPED   Not started
mobileye                         STOPPED   Not started
monitor                          RUNNING   pid 207, uptime 1:08:46
navigation_control               STOPPED   Not started
navigation_localization          STOPPED   Not started
navigation_perception            STOPPED   Not started
navigation_planning              STOPPED   Not started
navigation_prediction            STOPPED   Not started
navigation_routing               STOPPED   Not started
navigation_server                STOPPED   Not started
perception                       STOPPED   Not started
planning                         STOPPED   Not started
prediction                       STOPPED   Not started
routing                          STOPPED   Not started
third_party_perception           STOPPED   Not started
  • View process logs
ubuntu@in_dev_docker:/apollo$ sudo supervisorctl
canbus                           STOPPED   Not started
conti_radar                      STOPPED   Not started
control                          STOPPED   Not started
dreamview                        RUNNING   pid 266, uptime 1:10:49
gps                              STOPPED   Not started
localization                     STOPPED   Not started
mobileye                         STOPPED   Not started
monitor                          RUNNING   pid 207, uptime 1:10:50
navigation_control               STOPPED   Not started
navigation_localization          STOPPED   Not started
navigation_perception            STOPPED   Not started
navigation_planning              STOPPED   Not started
navigation_prediction            STOPPED   Not started
navigation_routing               STOPPED   Not started
navigation_server                STOPPED   Not started
perception                       STOPPED   Not started
planning                         STOPPED   Not started
prediction                       STOPPED   Not started
routing                          STOPPED   Not started
third_party_perception           STOPPED   Not started
supervisor> tail -f dreamview 
==> Press Ctrl-C to exit <==
pty
E0621 11:40:39.131963   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.232152   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.332326   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.431500   266 hdmap_util.cc:120] RelativeMap is empty
E0621 11:40:39.531697   266 hdmap_util.cc:120] RelativeMap is empty
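When debugging from a script rather than interactively, the status listing shown earlier in this section can be parsed into a structure that is easy to filter. The following Python sketch does this for a shortened sample of that output. The parser is a hypothetical helper written for this post and assumes the three-column layout (name, state, detail) seen above.

```python
import re

# Shortened sample of the `supervisorctl status` output shown earlier.
SAMPLE = """\
dreamview                        RUNNING   pid 266, uptime 1:08:45
monitor                          RUNNING   pid 207, uptime 1:08:46
routing                          STOPPED   Not started
"""

# name, upper-case state, then free-form detail text.
LINE_RE = re.compile(r"^(?P<name>\S+)\s+(?P<state>[A-Z]+)\s*(?P<detail>.*)$")


def parse_status(text):
    """Turn status output into {name: (state, detail)}."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            result[match["name"]] = (match["state"], match["detail"])
    return result


status = parse_status(SAMPLE)
running = sorted(name for name, (state, _) in status.items() if state == "RUNNING")
print(running)  # ['dreamview', 'monitor']
```

In practice the text would come from running `supervisorctl status` (for example via `subprocess.run` with `capture_output=True`) instead of the hard-coded sample.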

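Note: to start or stop a process, it is still best to do so from Dreamview. If Dreamview itself is misbehaving and you still need to debug, you can operate from the supervisor command line instead.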