
FANTrack Code Reading Notes


1. FANTrack System Architecture

[Figure: FANTrack system architecture]

Siamese network for similarity learning:
[Figure: Siamese similarity network]
Bounding box branch:
[Figure: bounding-box branch]
Appearance branch:
[Figure: appearance branch]
Association network:
[Figure: association network]

Computation of the similarity maps (used to obtain the final tracking result; this step is not needed during training):

[Figure: similarity map computation]

2. My Understanding of the FANTrack Framework

Unlike mmMOT, FANTrack first obtains a Bbox similarity matrix and an appearance similarity matrix separately, computes a weight for each based on the Bbox and appearance features, and then fuses the two with these weights to get the final similarity matrix. In mmMOT, by contrast, the features are fused first and the fused features are then used to compute the similarity matrix.
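
A minimal NumPy sketch of this decision-fusion idea; the array names and shapes here are illustrative, not taken from the repo:

    import numpy as np

    # hypothetical per-pair similarities and per-pair branch weights (targets x measurements)
    sim_bbox = np.random.uniform(-1, 1, size=(5, 7))
    sim_appear = np.random.uniform(-1, 1, size=(5, 7))
    w_bbox = np.random.uniform(0, 1, size=(5, 7))
    w_appear = np.random.uniform(0, 1, size=(5, 7))

    # normalize the two weights for every pair, then fuse the two similarity matrices
    norm = w_bbox + w_appear
    sim = (w_bbox / norm) * sim_bbox + (w_appear / norm) * sim_appear
    sim = np.clip(sim, -1.0, 1.0)    # final similarity matrix, values in [-1, 1]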

FANTrack also adds a prediction of the occlusion state. It is essentially just an extra small network, made up of a few fully connected layers, that predicts whether a target is occluded; according to the paper this improves performance, so I could also try adding this small prediction model.

[Figure: FANTrack framework as I understand it]

3. FANTrack Code Structure

This code is not as well modularized as mmMOT: everything of the same kind is put into a single file, with only a simple separation between the implementation and the framework construction, so adding new modules will probably be difficult.

[Figure: code structure]

4. Code Walkthrough

Tracker.py

The complete tracking pipeline, including Kalman filter prediction, data association (feature computation, similarity computation, track association), and track management (creation, update, deletion).

The tracking algorithm is as follows:

[Figure: tracking algorithm]

There are three classes, Track, TrackedObject, and Tracker, which represent a track, a tracked object, and the tracker respectively.

  • class Track: Kalman-filter and track state (a minimal Kalman sketch follows after this list)
    • init: initialization of track parameters such as track_id, current_target, age, type, and of Kalman filter parameters such as F/H/P/Q/R
    • matrix getters: get_P_matrix, get_R_matrix, get_Q_matrix
    • track update: update()
    • track prediction: predict()
  • class TrackedObject: a successfully tracked object
    • init: initialization of track_id, occluded, truncated, etc.
    • file-writing function
    • label-update function
  • def build_global_maps(): builds the global/local similarity maps from simNet's output
    • called in train.py and in run_simnet
  • class Tracker: the tracking process
    • init
      • data preparation
      • detector initialization: AVOD or RPN are supported
      • build the SimNet model
      • build the AssocNet model
    • begin_tracking(): the main tracking loop (see the loop sketch after this list)
      • compute start_frame and end_frame for frame-by-frame processing
      • frame-by-frame processing
        • frame 1: initialize a new track for each detection, create_new_track()
        • later frames
          • predict: self.predict_tracks() (I have not fully figured out what predict is for; presumably it is the Kalman prediction step that propagates each track's state to the current frame before association)
          • associate: associations = self.get_associations(targets, measurements)
          • update: self.update_tracks(targets, measurements, associations)
        • post-processing: track.age += 1, writing results to file, visualization, etc.
    • get_associations(): computes the associations; it covers three major steps: object detection, similarity computation, and data association
      • coordinate-system conversion
      • similarity map computation
        • compute measurement_locations, target_centers, etc., used to build the global and local similarity maps
        • fetch the Bbox vectors and appearance features of the targets and detections
        • run simNet: from the raw appearance and Bbox features, obtain the features and weights used to compute the similarities; the ablation code also lives here, covering three ablations: using the Mahalanobis distance, dropping the AVOD features, and dropping the Bbox features
        • build the global/local similarity maps
      • association computation
        • run assocNet to obtain m_pred_x, m_pred_y
        • obtain the association results: associations, corr_scores
        • remap to obtain the final result
    • track-related functions
      • create_new_track()
      • destory_track()
      • predict_tracks()
      • update_track(): performs track update, creation, and deletion according to the association results
    • data-preparation functions
      • get_meaurements()
      • get_gt_detections()
      • get_object_attribute()
      • get_predictions_from_avod()
    • other functions
      • check_none_associations(): handles the cases where measurements or targets are empty
      • get_associations(): derives the association results from AssocNet's predictions
      • prepare_simnet_inputs(): prepares simNet's inputs from the measurements and targets
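
As a rough companion to the Track class above, here is a minimal sketch of a linear Kalman predict/update cycle using the F/H/P/Q/R matrices mentioned there; the constant-velocity state layout and all numbers are my own assumptions, not the repo's:

    import numpy as np

    class SimpleTrack:
        """Toy constant-velocity Kalman track with state x = [px, py, vx, vy]."""
        def __init__(self, z, dt=0.1):
            self.x = np.array([z[0], z[1], 0.0, 0.0])                  # state initialized from a detection
            self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                               [0, 0, 1, 0], [0, 0, 0, 1]], float)     # motion model
            self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)     # measurement model
            self.P = np.eye(4) * 10.0                                  # state covariance
            self.Q = np.eye(4) * 0.01                                  # process noise
            self.R = np.eye(2) * 1.0                                   # measurement noise

        def predict(self):
            # propagate the state and covariance to the current frame
            self.x = self.F @ self.x
            self.P = self.F @ self.P @ self.F.T + self.Q
            return self.x

        def update(self, z):
            # correct the prediction with the associated measurement z = [px, py]
            y = z - self.H @ self.x                          # innovation
            S = self.H @ self.P @ self.H.T + self.R          # innovation covariance
            K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
            self.x = self.x + K @ y
            self.P = (np.eye(4) - K @ self.H) @ self.P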

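And a condensed sketch of the begin_tracking() flow described above; the method names follow the bullets, but this is my paraphrase (coordinate conversion, output writing, etc. are omitted), not the repo's exact code:

    def begin_tracking(self, start_frame, end_frame):
        for frame in range(start_frame, end_frame):
            measurements = self.get_meaurements(frame)       # detections of the current frame
            if frame == start_frame:
                # first frame: every detection starts a new track
                for m in measurements:
                    self.create_new_track(m)
                continue
            # later frames: predict -> associate -> update
            self.predict_tracks()                            # Kalman prediction for every track
            targets = [track.current_target for track in self.tracks]
            associations = self.get_associations(targets, measurements)
            self.update_tracks(targets, measurements, associations)
            # post-processing
            for track in self.tracks:
                track.age += 1
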
Train.py

Contains a single class, Trainer, which is used to train the different models.

  • train_simnet()
    • initialization: saver setup, variable initialization
    • per-epoch training
      • training
      • checkpoint saving
      • validation: run on the validation set, compute the loss
  • train_assocnet()
    • initialization: saver setup, variable initialization
    • per-epoch training
      • training
      • checkpoint saving
      • validation: run on the validation set, compute the loss
  • run_simnet(): runs simnet and saves its output (the local similarity maps) as the input to AssocNet
    • calls create_similarity_maps()
  • create_similarity_maps() (see the sketch after this list)
    • runs simnet to obtain the features, weights, target_center, etc.
    • builds the local_corr_map (tracker.py -> build_global_maps)
    • returns the output
  • associations_results(): derives the final association results from the predictions m_pred_x and m_pred_y
    • involves a conversion between global and local coordinates
  • train_combine_network(): joint training of the two networks
    • initialization: saver setup, variable initialization
    • per-epoch training
      • training
      • checkpoint saving
      • validation: run on the validation set, compute the loss
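
A rough NumPy sketch of what building a local similarity map could look like: the similarity of every target-measurement pair is scattered into a BEV grid at the measurement's cell, and a CROP_SIZE window centered on the target is cut out as that target's local map. The grid size, helper name, and details are my assumptions; only the overall idea follows the description above.

    import numpy as np

    CROP_SIZE = 48      # assumed local window size (the repo's GlobalConfig.CROP_SIZE)
    GRID = 200          # assumed global BEV grid size
    half = CROP_SIZE // 2

    def build_local_maps(target_cells, meas_cells, similarities):
        """target_cells: (T,2) int grid coords, meas_cells: (M,2), similarities: (T,M)."""
        local_maps = np.zeros((len(target_cells), CROP_SIZE, CROP_SIZE), np.float32)
        for t, (tx, ty) in enumerate(target_cells):
            # global map for this target: every measurement's score at its grid cell
            global_map = np.zeros((GRID, GRID), np.float32)
            for m, (mx, my) in enumerate(meas_cells):
                global_map[mx, my] = similarities[t, m]
            # crop a CROP_SIZE x CROP_SIZE window centered on the target
            padded = np.pad(global_map, half)        # padding avoids boundary problems
            local_maps[t] = padded[tx:tx + CROP_SIZE, ty:ty + CROP_SIZE]
        return local_maps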

model.py

The association model; it contains a single class, AssocModel.

  • init(): initialization, including the Bbox shape, appearance shape, center shape, correlation shape, locations, etc.

  • build_simnet(): construction of the simnet network

    • initialization of some variables, including the Bbox and appearance vectors of the targets at t-1, the Bbox and appearance vectors of the objects at t, the labels, the number of targets, etc.

      self.in_bbox_tar_ph = tf.placeholder(tf.float32, self.pos_shp_shape, 'in_bbox_tar')
      self.in_bbox_meas_ph = tf.placeholder(tf.float32, self.pos_shp_shape, 'in_bbox_meas')
      self.in_appear_tar_ph = tf.placeholder(tf.float32, self.appear_shape, 'in_appear_tar')
      self.in_appear_meas_ph = tf.placeholder(tf.float32, self.appear_shape, 'in_appear_meas')
      self.simnet_labels_ph = tf.placeholder(tf.float32, self.lab_shape, 'labels')
      self.training_ph = tf.placeholder(tf.bool, name = 'training')
      self.num_target_ph = tf.placeholder(tf.int32,shape=[1], name = 'num_targets')
      self.global_step = tf.placeholder(tf.int32, name = 'global_step')
      
      self.in_bbox_ph = tf.concat([self.in_bbox_tar_ph,self.in_bbox_meas_ph], axis=0)
      self.in_appear_ph = tf.concat([self.in_appear_tar_ph,self.in_appear_meas_ph], axis=0)
      self.num_targets = self.num_target_ph[0]
      
    • Bbox branch

'''bbox branch'''

# cross channel pooling of 3d bbox params
self.conv1_0_act = slim.conv2d(self.in_bbox_ph, 256, 1, scope = 'conv1_0')

# flattens the bbox maps into 1D for each batch
net1 = slim.flatten(self.conv1_0_act,  scope = 'feat_1d_1')

# fc layers with dropout for bbox features
net1 = slim.dropout(net1, keep_prob=0.5, scope = 'dropout1')
self.fc1_act = slim.fully_connected(net1, 512, scope='fc1')

net1 = slim.dropout(self.fc1_act, keep_prob=0.5, scope = 'dropout2')
self.fc2_act = slim.fully_connected(net1, 512,  scope='fc2')

# remove the feature distribution change before normalization
features1 = slim.batch_norm(self.fc2_act,scope='ball1')

# project the features to unit hypersphere
self.unit_features1 = tf.nn.l2_normalize(features1, axis=1)
    • appearance branch (a custom network is used here rather than an off-the-shelf one)

    # fc layer implemented by a conv layer for avod's features
    self.conv2_0_act = slim.conv2d(self.in_appear_ph, 256, 3, scope = 'conv2_0')
    
    ''' To do: The fc layer is implemented using 7x7 kernel and valid padding'''
    
    # Global Average Pooling
    # self.conv2_1_act = tf.layers.average_pooling2d(inputs=self.conv2_0_act, pool_size=7, strides=7)
    self.conv2_1_act = tf.nn.avg_pool(self.conv2_0_act,ksize=[1, 7, 7, 1], strides=[1, 1, 1, 1], padding='VALID')
    
    # fc layer implemented by a conv layer for avod's features
    # self.conv2_1_act = slim.conv2d(self.conv2_0_act, 256, 1, scope = 'conv2_1')
    
    # flattens the appearance maps into 1D for each batch
    net2 = slim.flatten(self.conv2_1_act,  scope = 'feat_1d_2')
    
    # fc layers with dropout for appearance features
    net2 = slim.dropout(net2, keep_prob=0.5, scope = 'dropout3')
    self.fc3_act = slim.fully_connected(net2, 512, scope='fc3')
    
    net2 = slim.dropout(self.fc3_act, keep_prob=0.5, scope = 'dropout4')
    self.fc4_act = slim.fully_connected(net2, 512, scope='fc4')
    
    # remove the feature distribution change before normalization
    features2 = slim.batch_norm(self.fc4_act,scope='ball2')
    
    # project the features to unit hypersphere
    self.unit_features2 = tf.nn.l2_normalize(features2, axis=1)
    • feature splitting: the Bbox features and appearance features of the targets and measurements, which were concatenated together, are sliced apart so the similarities can be computed later

    # slice bbox feature tensor into targets and measurements
    self.t_feat1 = tf.slice(self.unit_features1,[0,0],[self.num_targets,-1], 't_feat1')
    
    self.m_feat1 = tf.slice(self.unit_features1,[self.num_targets,0],[-1,-1], 'm_feat1')
    
    # slice appearance feature tensor into targets and measurements
    self.t_feat2 = tf.slice(self.unit_features2,[0,0],[self.num_targets,-1], 't_feat2')
    
    self.m_feat2 = tf.slice(self.unit_features2,[self.num_targets,0],[-1,-1], 'm_feat2')
    
    • computation of the weights of the Bbox similarity and the appearance similarity

    # compute the weights of bbox and appearance similarities
    concat_feat = tf.concat((self.unit_features1,self.unit_features2), axis=1)
    
    logits = slim.fully_connected(concat_feat, 2, scope='logits')
    
    branch_weights = tf.nn.softmax(logits, axis=1, name='branch_weights')
    
    # slice normalized weights into bbox and appearance weights for targets and measurements
    self.w_bbox_t = tf.slice(branch_weights,[0,0],[self.num_targets,1], 'w_bbox_t')
    
    self.w_appear_t = tf.slice(branch_weights,[0,1],[self.num_targets,1], 'w_appear_t')
    
    self.w_bbox_m = tf.slice(branch_weights,[self.num_targets,0],[-1,1], 'w_bbox_m')
    
    self.w_appear_m = tf.slice(branch_weights,[self.num_targets,1],[-1,1], 'w_appear_m')
    
    • definition of the loss, optimizer, and predictor

    if self.training_type == TrainingType.Simnet or self.training_type == TrainingType.Both:
        self.build_simnet_loss_function()
    
  • build_simnet_loss_function()

    • the cosine similarities of the appearance features and of the Bbox features are computed separately; once the similarity matrix is available, the simNet loss can be computed and back-propagated. The global/local similarity maps only need to be computed at actual tracking time; they are not part of model training.

      # compute the cosine similarities
      self.cosine_sim1 = tf.reduce_sum( tf.multiply(self.t_feat1, self.m_feat1), 1, keepdims=True)
      
      self.cosine_sim2 = tf.reduce_sum( tf.multiply(self.t_feat2, self.m_feat2), 1, keepdims=True)
      
    • compute the Bbox and appearance similarity weights, w_bbox = w_bbox_t * w_bbox_m and w_appear = w_appear_t * w_appear_m, then normalize them to obtain the final weights

      # compute weights for cosine similarities
      self.w_bbox = tf.multiply(self.w_bbox_t, self.w_bbox_m, 'weight_bbox')
      
      self.w_appear = tf.multiply(self.w_appear_t, self.w_appear_m, 'weight_appear')
      
      # normalize weights for each target-measurement pair
      w_norm_const = self.w_bbox + self.w_appear
      
      self.w_bbox = self.w_bbox / w_norm_const
      
      self.w_appear = self.w_appear / w_norm_const
      
    • decision fusion: the final similarity is computed as weight * similarity

      # decision fusion
      self.cosine_sim =  tf.add(self.w_bbox*self.cosine_sim1,self.w_appear*self.cosine_sim2,'cosine_sim')
      
      # clip cosine_sim to a value in [-1,1]
      self.cosine_sim = tf.clip_by_value(self.cosine_sim, -1.0, 1.0)
      
    • compute precision and recall, compute the loss and optimize; the loss function is a weighted cosine distance

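      For reference, one plausible way to turn the fused cosine similarity into a "weighted cosine distance" loss, assuming labels of 1 for matching pairs and 0 for non-matching pairs; this is only my sketch, not necessarily the exact formulation used in the repo:

      # assumed: self.simnet_labels_ph holds 1 for matching pairs and 0 otherwise
      labels = self.simnet_labels_ph

      # cosine distance: 0 for identical directions, 2 for opposite directions
      cos_dist = 1.0 - self.cosine_sim

      # matching pairs are pulled together, non-matching pairs pushed beyond a margin
      margin = 0.5
      pos_loss = labels * cos_dist
      neg_loss = (1.0 - labels) * tf.maximum(0.0, margin - cos_dist)
      self.simnet_loss = tf.reduce_mean(pos_loss + neg_loss)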

  • build_assoc_model()

    • variables: the local association maps, the number of targets, the labels, the average loss, the average accuracy, the epoch number

    • architecture

      [Figure: AssocNet architecture]

    • logic diagram

      [Figure: AssocNet computation graph]

      # the initial input is the local similarity map computed by simNet
      # input node for the first dilated convolutional layer
      net = self.in_local_corr_map_ph

      # dilated convolutions
      for l in range(0, num_layers):

          # the dilation rate for the convolutional layers
          if l < 2:
              dilation_rate = 2**(l+1)
          else:
              dilation_rate = 6

          net = slim.conv2d(net, num_kernels, kernel_size, rate=dilation_rate, padding='SAME',
                            normalizer_params={'is_training': self.in_training_assocnet_ph,
                                               'scope': 'bn_{}'.format(str(l))},
                            scope='conv_{}'.format(str(l)))

          # save activations to summarize their histogram
          hist_activations[l] = net

      # (the mask computation code is omitted here)
      # use this to make logits for irrelevant pixels -inf
      self.masks = tf.to_float(np.finfo(np.float64).min*self.masks)

      logits_maps = slim.conv2d(net, self.channels, kernel_size, scope='logits')

      # make the network predict one of the measurements, if available, for each target
      logits_maps = tf.add(logits_maps, self.masks, 'masked_logits')
      
    • occlusion (undetected) state prediction

    # compute the logits for the undetection probability of each target
    net = slim.flatten(logits_maps, scope='flat_logit_maps')
    
    net = slim.fully_connected(net, 512, scope='fc1')
    
    net = slim.fully_connected(net, 512, scope='fc2')
    
    logits_undetect = slim.fully_connected(net, self.channels, scope='fc3')
    
    • computation of the association results

      # flat out maps for each target
      logits_maps_1d = tf.reshape(logits_maps,[1,GlobalConfig.CROP_SIZE**2,-1])
      
      # flat out undetection logits
      logits_undetect_1d = tf.reshape(logits_undetect,[1, 1, -1])
      
      # concatenate logits for maps and pixels for undetections
      logits_map_undetect = tf.concat([logits_maps_1d,logits_undetect_1d], axis=1)
      
      # perform softmax op on each map to compute association prob. and 1 - detection probability
      maps_undetect_1d = tf.nn.softmax(logits_map_undetect, axis=1)
      
      # obtain the real maps for each target
      maps_1d = tf.slice(maps_undetect_1d, [0,0,0], [1, GlobalConfig.CROP_SIZE**2, -1])
      
      # obtain the detection probability for each target
      self.Pd = 1 - tf.slice(maps_undetect_1d, [0, GlobalConfig.CROP_SIZE**2, 0], [1,-1,-1])
      
      # 2D predicted output maps
      self.maps = tf.reshape(maps_1d,tf.shape(logits_maps))
      
    • post-processing: histogram summaries, removal of irrelevant entries (a fixed number of track slots is used to handle the varying number of tracks, and the dummy tracks are removed here)

    • precision and recall

    • loss function: the assoc loss is a cross-entropy loss

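As a hedged sketch of what a cross-entropy loss over the association maps could look like, assuming the labels are one-hot maps laid out like maps_undetect_1d (CROP_SIZE**2 map cells plus one "undetected" cell per target); the placeholder name in_assoc_labels_ph is hypothetical and this is my reconstruction, not the repo's code:

    # assumed label layout: for each target a one-hot vector over CROP_SIZE**2 map cells
    # plus one extra "undetected" cell (in_assoc_labels_ph is a hypothetical placeholder)
    labels_1d = tf.reshape(self.in_assoc_labels_ph, [1, GlobalConfig.CROP_SIZE**2 + 1, -1])

    # softmax cross-entropy per target over the concatenated map/undetection logits
    xent = tf.nn.softmax_cross_entropy_with_logits_v2(labels=labels_1d,
                                                      logits=logits_map_undetect,
                                                      dim=1)
    assoc_loss = tf.reduce_mean(xent)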
