DenseNet (Algorithm + Code)
Paper: Densely Connected Convolutional Networks
Code (GitHub): code
1. Network Structure
The network draws on ResNet (which addressed the vanishing-gradient problem of very deep networks) and on GoogLeNet's Inception (which addressed network width). As shown in the figure, the connectivity is dense: in a traditional network each layer only takes the previous layer's output as input, so an L-layer network has L connections in total; in DenseNet every layer is connected to all later layers, giving L(L+1)/2 direct connections (for example, L = 5 already yields 15).
The first problem to solve when making a network deeper is vanishing gradients, and the remedy is to make the paths between earlier and later layers as short as possible. In the figure above, for instance, layer H4 can directly use the original input X0, as well as the features that earlier layers have already computed from X0, which maximizes information flow. During back-propagation, the gradient at X0 contains the derivative of the loss taken directly with respect to X0, which helps gradients propagate.
Advantages of DenseNet:
1. Alleviates the vanishing-gradient problem
2. Strengthens feature propagation
3. Makes more effective use of features (feature reuse)
4. Reduces the number of parameters to some extent
2. Dense Block Structure
As shown in the figure, each layer implements a composite non-linear transformation H_l(·) (e.g. BN, ReLU, convolution), and x_l denotes the output of the l-th layer.

For ResNet:

x_l = H_l(x_{l-1}) + x_{l-1}

Because the identity mapping and the output of H_l are combined by summation, information flow through the network can be impeded. Inspired by GoogLeNet, DenseNet combines features by concatenation instead:

x_l = H_l([x_0, x_1, ..., x_{l-1}])

where [x_0, x_1, ..., x_{l-1}] denotes the concatenation of the feature maps produced by layers 0 to l-1.

Since the concatenation operation requires the feature maps to have the same spatial size, the network is split into several Dense Blocks, with down-sampling performed between blocks.
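A minimal TF 1.x sketch of one dense layer, assuming H_l is BN -> ReLU -> 3×3 convolution producing k feature maps; the function and argument names are illustrative and do not correspond to the repository's add_layer() helper.

import tensorflow as tf

def dense_layer(x, growth_rate, is_training, name):
    """One dense layer: compute H_l(x) and concatenate it with the input."""
    with tf.variable_scope(name):
        h = tf.layers.batch_normalization(x, training=is_training)
        h = tf.nn.relu(h)
        h = tf.layers.conv2d(h, filters=growth_rate, kernel_size=3, padding='same')
        return tf.concat([x, h], axis=3)   # x_l = [x_0, ..., x_{l-1}, H_l(·)]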
The layers between Dense Blocks are called transition layers and consist of BN -> Conv(1×1) -> AveragePooling(2×2).
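Continuing the same sketch, a transition layer between two dense blocks could look like the following (names are again illustrative, not the repository's add_transition_average()):

import tensorflow as tf

def transition_layer(x, out_filters, is_training, name):
    """Transition between blocks: BN -> 1x1 conv -> 2x2 average pooling."""
    with tf.variable_scope(name):
        h = tf.layers.batch_normalization(x, training=is_training)
        h = tf.layers.conv2d(h, filters=out_filters, kernel_size=1, padding='same')
        return tf.layers.average_pooling2d(h, pool_size=2, strides=2)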
Growth rate: because each layer's input is the concatenation of the outputs of all preceding layers, each layer does not need to produce as many feature maps as in a traditional network. The output of every H_l(·) has k feature maps; k is the growth rate, which controls the "width" of the network (the number of feature-map channels). For example, the l-th layer of a block receives k_0 + k × (l − 1) input feature maps, where k_0 is the number of channels entering the block.
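A worked example of this channel bookkeeping, with illustrative values k_0 = 32 and k = 12 (chosen for the example, not the paper's configuration):

k0, k = 32, 12                      # channels entering the block, growth rate
for l in range(1, 5):
    print('layer %d receives %d input feature maps' % (l, k0 + k * (l - 1)))
# layer 1 receives 32, layer 2 receives 44, layer 3 receives 56, layer 4 receives 68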
Although each layer only produces k output feature maps, the inputs to later layers still grow large, so Bottleneck layers are introduced. Essentially, a 1×1 convolution is inserted to reduce the number of input feature maps:

BN -> ReLU -> Conv(1×1) -> BN -> ReLU -> Conv(3×3)

The paper calls a network with Bottleneck layers DenseNet-B.
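A minimal sketch of a bottleneck (DenseNet-B) layer along the same lines; the paper has the 1×1 convolution produce 4k feature maps, and the names here are again illustrative:

import tensorflow as tf

def bottleneck_layer(x, growth_rate, is_training, name):
    """DenseNet-B layer: 1x1 conv reduces the concatenated input before the 3x3 conv."""
    with tf.variable_scope(name):
        h = tf.layers.batch_normalization(x, training=is_training)
        h = tf.nn.relu(h)
        h = tf.layers.conv2d(h, filters=4 * growth_rate, kernel_size=1, padding='same')
        h = tf.layers.batch_normalization(h, training=is_training)
        h = tf.nn.relu(h)
        h = tf.layers.conv2d(h, filters=growth_rate, kernel_size=3, padding='same')
        return tf.concat([x, h], axis=3)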
Besides reducing the number of feature maps inside a Dense Block, they can be compressed further in the transition layers. If a Dense Block outputs m feature maps, the following transition layer produces ⌊θm⌋ of them, where 0 < θ ≤ 1. A network with this operation is called DenseNet-C.

A network that contains both Bottleneck layers and Compression is called DenseNet-BC.
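A quick worked example of the compression factor (θ = 0.5 is the value the paper uses for DenseNet-BC; m = 256 is chosen only for illustration):

import math
m, theta = 256, 0.5
print(int(math.floor(theta * m)))   # the transition layer emits 128 feature maps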
The detailed network configurations (see the architecture table in the paper):
3. Code Analysis
3.1 DenseNet
Network pipeline:
conv1 → conv2 → block1 → transition1 → block2 → transition2 → block3 → transition3 → block4 → block3_up → bn_relu_conv → block2_up → bn_relu_conv → block1_up → bn_relu_conv → upsample1 → bn_relu_conv3 → bn_sigmoid_conv
# Helpers (conv2d, batch_norm_layer, add_layer, add_transition_average, bn_relu_conv,
# upsample) and the globals growth_rate, dense_block{1..4}_num, ckpt and sess come from
# the linked repository and are assumed to be defined at module level.
import tensorflow as tf

def dense_net(image, img_name_index, is_training=True):
    with tf.variable_scope('conv1') as scope:
        l = conv2d(image, 3, 16, 3, 1)                   # conv2d(x, in_ch, out_ch, kernel, stride)
        l = batch_norm_layer(l, is_training)
        l = tf.nn.relu(l)
    with tf.variable_scope('conv2') as scope:
        l = conv2d(l, 16, 32, 3, 2)                      # stride 2: first down-sampling
        l = batch_norm_layer(l, is_training)
        l_first_down = tf.nn.relu(l)                     # kept for a skip connection in the decoder
        l = tf.nn.max_pool(l_first_down, [1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME')

    # ----- encoder: dense blocks separated by transition layers -----
    with tf.variable_scope('block1') as scope:
        l = conv2d(l, 32, growth_rate, 3, 1)
        for i in range(dense_block1_num):
            # each add_layer appends growth_rate feature maps to l
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
    block1, l = add_transition_average('transition1', l, is_training,
                                       input_filters=growth_rate * (dense_block1_num + 1),
                                       output_filters=32)
    with tf.variable_scope('block2') as scope:
        l = conv2d(l, 32, growth_rate, 3, 1)
        for i in range(dense_block2_num):
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
    block2, l = add_transition_average('transition2', l, is_training,
                                       input_filters=growth_rate * (1 + dense_block2_num),
                                       output_filters=32)
    with tf.variable_scope('block3') as scope:
        l = conv2d(l, 32, growth_rate, 3, 1)
        for i in range(dense_block3_num):
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
    block3, l = add_transition_average('transition3', l, is_training,
                                       input_filters=growth_rate * (1 + dense_block3_num),
                                       output_filters=32)
    with tf.variable_scope('block4') as scope:
        l = conv2d(l, 32, growth_rate, 3, 1)
        for i in range(dense_block4_num):
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
        # l = add_transition_average('transition4', block4, is_training,
        #                            input_filters=growth_rate * dense_block4_num,
        #                            output_filters=32)

    # ----- decoder: upsample and concatenate with the matching encoder block -----
    with tf.variable_scope('block3_up') as scope:
        l = bn_relu_conv(l, is_training, growth_rate * (1 + dense_block4_num), 32, 3, 1,
                         name='bn_relu_conv1')
        l = upsample(l, 32, 32, 3, 2)
        l = tf.concat([l, block3], 3)                    # skip connection from block3
        l = bn_relu_conv(l, is_training, 64, growth_rate, 3, 1, name='bn_relu_conv2')
        for i in range(dense_block3_num):
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
    with tf.variable_scope('block2_up') as scope:
        l = bn_relu_conv(l, is_training, growth_rate * (1 + dense_block3_num), 32, 3, 1,
                         name='bn_relu_conv1')
        l = upsample(l, 32, 32, 3, 2)
        l = tf.concat([l, block2], 3)                    # skip connection from block2
        l = bn_relu_conv(l, is_training, 64, growth_rate, 3, 1, name='bn_relu_conv2')
        for i in range(dense_block2_num):
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
    with tf.variable_scope('block1_up') as scope:
        l = bn_relu_conv(l, is_training, growth_rate * (1 + dense_block2_num), 32, 3, 1,
                         name='bn_relu_conv1')
        l = upsample(l, 32, 32, 3, 2)
        l = tf.concat([l, block1], 3)                    # skip connection from block1
        l = bn_relu_conv(l, is_training, 64, growth_rate, 3, 1, name='bn_relu_conv2')
        for i in range(dense_block1_num):
            l = add_layer('dense_layer.{}'.format(i), l, is_training,
                          input_filters1=growth_rate * (i + 1))
    l = bn_relu_conv(l, is_training, growth_rate * (1 + dense_block1_num), 64, 3, 1,
                     name='bn_relu_conv1')
    with tf.variable_scope('upsample1') as scope:
        l = upsample(l, 64, 64, 3, 2)
        l = bn_relu_conv(l, is_training, 64, 32, 3, 1)
        l = tf.concat([l, l_first_down], 3)              # skip connection from the first down-sampling
        l = bn_relu_conv(l, is_training, 64, 64, 3, 1, name='bn_relu_conv2')
    with tf.variable_scope('bn_relu_conv3') as scope:
        l = upsample(l, 64, 64, 3, 2)
        l = bn_relu_conv(l, is_training, 64, 16, 3, 1)
    with tf.variable_scope('bn_sigmoid_conv') as scope:
        l = bn_relu_conv(l, is_training, 16, 1, 1, 1)
        image_conv = tf.nn.sigmoid(l)                    # single-channel probability map

    # restore the trained weights only once, when the first image is processed
    saver = tf.train.Saver()
    if ckpt and ckpt.model_checkpoint_path:
        if img_name_index == 0:
            saver.restore(sess, ckpt.model_checkpoint_path)
            print('model restored')
    return image_conv
3.2 Using reuse
reuse controls whether an existing variable scope (and the variables inside it) may be reused instead of being created again.
The first call sets reuse to False: the network has not been built yet, so its variables must be created, and building the same scope again would not be allowed.
From the second call onward, reuse is set to True: the test code runs in a loop, and later calls must share the variables of the network that was already built.
if img_name_index == 0:
    output = inference(img_input, img_name_index, is_training=False, scope_reuse=False)
else:
    output = inference(img_input, img_name_index, is_training=False, scope_reuse=True)
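For reference, a minimal sketch of how scope_reuse might be passed down to tf.variable_scope; the scope name 'dense_net' and the way inference wraps dense_net are assumptions based on the code above, not necessarily the repository's exact implementation.

import tensorflow as tf

def inference(img_input, img_name_index, is_training=False, scope_reuse=False):
    # with reuse=False the variables are created; with reuse=True later calls
    # look up and share the variables that already exist under this scope
    with tf.variable_scope('dense_net', reuse=scope_reuse):
        return dense_net(img_input, img_name_index, is_training=is_training)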
3.3 Building the rotated/flipped test set at test time
# map each augmented prediction back to the original orientation, then combine them
preds_x = np.squeeze(preds_all[0])                                               # original image
preds_x90 = cv2.warpAffine(np.squeeze(preds_all[1], axis=0), M270, (224, 224))   # rotate back by 270 deg
preds_x180 = cv2.warpAffine(np.squeeze(preds_all[2], axis=0), M180, (224, 224))  # rotate back by 180 deg
preds_x270 = cv2.warpAffine(np.squeeze(preds_all[3], axis=0), M90, (224, 224))   # rotate back by 90 deg
preds_xup_down = np.squeeze(preds_all[4][::-1, :, :])                            # undo the up-down flip
preds_xleft_right = np.squeeze(preds_all[5][:, ::-1, :])                         # undo the left-right flip
preds = (preds_x + preds_x90 + preds_x180 + preds_x270 + preds_xup_down + preds_xleft_right)  # sum of the six predictions
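For context, a minimal sketch of how the six augmented test inputs and the rotation matrices M90/M180/M270 could be built; the 224×224 size comes from the code above, while the function name and the exact construction are assumptions, not the repository's code.

import cv2
import numpy as np

h, w = 224, 224
center = (w / 2.0, h / 2.0)
M90 = cv2.getRotationMatrix2D(center, 90, 1.0)    # rotate by 90 degrees
M180 = cv2.getRotationMatrix2D(center, 180, 1.0)
M270 = cv2.getRotationMatrix2D(center, 270, 1.0)

def make_test_batch(img):
    """Return the six augmented copies fed to the network (original, 3 rotations, 2 flips)."""
    return [img,
            cv2.warpAffine(img, M90, (w, h)),
            cv2.warpAffine(img, M180, (w, h)),
            cv2.warpAffine(img, M270, (w, h)),
            img[::-1, :, :],        # up-down flip
            img[:, ::-1, :]]        # left-right flip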
3.4 tips
* The tensors fed into the network are all 4-dimensional: batch, height, width, feature maps (channels).
* How to "deconv" (i.e. how to enlarge a feature map that carries deep features): one option is to zero-pad the small map and convolve it into a larger one, but this easily introduces redundancy and information loss. Bilinear interpolation works better, as sketched below.
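A minimal TF 1.x sketch contrasting the two options above: transposed convolution (the zero-padding view of "deconv") versus bilinear interpolation followed by a convolution. The function names and parameters are illustrative only and do not reproduce the repository's upsample() helper.

import tensorflow as tf

def upsample_bilinear(x, factor=2):
    """Enlarge the feature map with bilinear interpolation, then refine with a 3x3 conv."""
    h, w = x.get_shape().as_list()[1:3]
    x = tf.image.resize_bilinear(x, [h * factor, w * factor])
    return tf.layers.conv2d(x, filters=x.get_shape().as_list()[-1],
                            kernel_size=3, padding='same')

def upsample_transposed(x, out_channels, factor=2):
    """Transposed convolution (zero-padding + convolution view of 'deconv')."""
    return tf.layers.conv2d_transpose(x, filters=out_channels, kernel_size=3,
                                      strides=factor, padding='same')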