AlexNet Paper Notes
AlexNet_v1: ImageNet Classification with Deep Convolutional Neural Networks
AlexNet won first place in ILSVRC 2012, with a top-5 error more than 10 percentage points lower than the runner-up's.
AlexNet's main innovations:
1. Data augmentation
Images are randomly cropped (224x224 patches taken from 256x256 images) and horizontally flipped, which multiplies the training data by (256-224) x (256-224) x 2 = 2048. The intensities of the RGB channels are also altered (PCA-based color jitter).
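A minimal sketch of the crop-and-flip part of that augmentation, written in the same TF 1.x style as the code later in this post (the `image` placeholder is only for illustration):
import tensorflow as tf
# a single 256x256 RGB image (placeholder just for illustration)
image = tf.placeholder(tf.float32, [256, 256, 3])
# random 224x224 crop: 32 x 32 = 1024 possible crop positions
crop = tf.random_crop(image, [224, 224, 3])
# random horizontal flip doubles that: 32 x 32 x 2 = 2048 variants per image
augmented = tf.image.random_flip_left_right(crop)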
2. ReLU activation function
The activation function is ReLU, which avoids the saturation of tanh and sigmoid at both tails and therefore alleviates the vanishing-gradient problem during backpropagation.
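To make the saturation point concrete, compare the activation derivatives that the backward pass multiplies by:

$$\sigma'(x)=\sigma(x)\bigl(1-\sigma(x)\bigr)\;\xrightarrow{|x|\to\infty}\;0,\qquad \tanh'(x)=1-\tanh^2(x)\;\xrightarrow{|x|\to\infty}\;0,\qquad \mathrm{ReLU}'(x)=1\ \text{for } x>0$$

so gradients passing through a ReLU are not shrunk on the active side, whereas sigmoid/tanh gradients vanish once units sit in their flat tails.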
3. Overlapping pooling
Replacing ordinary pooling (2x2, stride 2) with overlapping pooling (3x3, stride 2) reduces the top-1 and top-5 error rates by 0.4% and 0.3%, respectively.
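A quick shape check (a sketch assuming a 13x13x256 feature map, roughly what feeds pool5) shows both settings downsample by the same factor; the difference is only that with a 3x3 window and stride 2 the pooling regions overlap:
import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D
x = tf.placeholder(tf.float32, [None, 13, 13, 256])
print(MaxPool2D([2, 2], 2)(x).shape)  # (?, 6, 6, 256) - non-overlapping 2x2 windows
print(MaxPool2D([3, 3], 2)(x).shape)  # (?, 6, 6, 256) - stride 2 < window 3, so windows overlap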
4. Local Response Normalization (LRN)
LRN is claimed to help generalization, although how much it actually helps is disputed.
LRN is somewhat similar in spirit to BatchNorm.
The idea comes mainly from "lateral inhibition" in biology.
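For reference, the LRN in the paper normalizes the activity $a^i_{x,y}$ of kernel map $i$ at position $(x,y)$ over $n$ adjacent kernel maps (out of $N$ total):

$$b^i_{x,y} = a^i_{x,y} \Big/ \Bigl(k + \alpha \sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)} \bigl(a^j_{x,y}\bigr)^2\Bigr)^{\beta}$$

with $k=2$, $n=5$, $\alpha=10^{-4}$, $\beta=0.75$ chosen on a validation set.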
5. Dropout
Applying dropout in the fully connected layers prevents overfitting; it is an efficient way of approximately combining many different models.
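One detail worth keeping in mind when reading the code below: tf.keras's `Dropout(rate)` takes the fraction of units to drop (not a keep probability) and rescales the surviving activations at training time. A tiny sketch of its two modes:
import tensorflow as tf
from tensorflow.keras.layers import Dropout
x = tf.ones([1, 8])                # toy activations
drop = Dropout(rate=0.5)           # the paper drops each fc unit with probability 0.5
y_train = drop(x, training=True)   # random masking, survivors scaled by 1/(1-rate)
y_test = drop(x, training=False)   # identity at inference time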
6. GPU implementation
The GPU implementation of AlexNet (the network is split across two GPUs) greatly speeds up training.
The figure below shows the AlexNet architecture:
The figure below shows the AlexNet architecture as redrawn in the ZFNet paper (personally I find this one clearer):
AlexNet has 8 layers in total: 5 convolutional layers and 3 fully connected layers.
Below is AlexNet written with a mix of TF and tf.keras:
# Implemented with TF plus tf.keras.layers, which keeps everything compatible within TF
# (tf.layers inside TF is a similar high-level API and is also convenient)
import tensorflow as tf
keras = tf.keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense
def inference(inputs,
              num_classes=1000,
              is_training=True,
              dropout_rate=0.5):
    '''
    Inference
    inputs: a tensor of images
    num_classes: number of output classes
    is_training: set to True when building the graph for training
    dropout_rate: fraction of fc activations dropped during training
    '''
    x = inputs
    # conv1: 96 kernels, 11x11, stride 4 (the paper's geometry works out for 227x227 inputs)
    x = Conv2D(96, [11, 11], 4, activation='relu', name='conv1')(x)
    # lrn1: hyperparameters from the paper (k=2, n=5, alpha=1e-4, beta=0.75)
    x = tf.nn.local_response_normalization(
        x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75, name='lrn1')
    # pool1: overlapping max pooling, 3x3 window, stride 2
    x = MaxPool2D([3, 3], 2, name='pool1')(x)
    # conv2
    x = Conv2D(256, [5, 5], activation='relu', padding='same', name='conv2')(x)
    # lrn2
    x = tf.nn.local_response_normalization(
        x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75, name='lrn2')
    # pool2
    x = MaxPool2D([3, 3], 2, name='pool2')(x)
    # conv3
    x = Conv2D(384, [3, 3], activation='relu', padding='same', name='conv3')(x)
    # conv4
    x = Conv2D(384, [3, 3], activation='relu', padding='same', name='conv4')(x)
    # conv5
    x = Conv2D(256, [3, 3], activation='relu', padding='same', name='conv5')(x)
    # pool5
    x = MaxPool2D([3, 3], 2, name='pool5')(x)
    # flatten
    x = Flatten(name='flatten')(x)
    # dropout before fc6 (only in the training graph)
    if is_training:
        x = Dropout(dropout_rate, name='dropout5')(x)
    # fc6
    x = Dense(4096, activation='relu', name='fc6')(x)
    # dropout before fc7
    if is_training:
        x = Dropout(dropout_rate, name='dropout6')(x)
    # fc7
    x = Dense(4096, activation='relu', name='fc7')(x)
    # fc8: raw class scores (no softmax here; the loss applies it)
    logits = Dense(num_classes, name='logit')(x)
    return logits
def build_cost(logits, labels, weight_decay_rate):
    '''
    Cost: softmax cross-entropy plus L2 weight decay
    logits: predictions
    labels: one-hot ground-truth labels
    weight_decay_rate: coefficient on the L2 penalty
    '''
    with tf.variable_scope('costs'):
        with tf.variable_scope('xent'):
            xent = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                logits=logits, labels=labels))
        with tf.variable_scope('decay'):
            costs = []
            for var in tf.trainable_variables():
                costs.append(tf.nn.l2_loss(var))
                tf.summary.histogram(var.op.name, var)  # summary
            cost_decay = tf.multiply(weight_decay_rate, tf.add_n(costs))
        cost = tf.add(xent, cost_decay)
        tf.summary.scalar('cost', cost)  # summary
    return cost
def build_train_op(cost, lrn_rate, global_step):
    '''
    Train op
    cost: total cost to minimize
    lrn_rate: learning rate
    global_step: global step
    '''
    with tf.variable_scope('train'):
        lrn_rate = tf.constant(lrn_rate, tf.float32)
        tf.summary.scalar('learning_rate', lrn_rate)  # summary
        trainable_variables = tf.trainable_variables()
        grads = tf.gradients(cost, trainable_variables)
        optimizer = tf.train.AdamOptimizer(lrn_rate)
        apply_op = optimizer.apply_gradients(
            zip(grads, trainable_variables),
            global_step=global_step, name='train_step')
        train_op = apply_op
    return train_op
if __name__ == '__main__':
    images = tf.placeholder(tf.float32, [None, 224, 224, 3])
    labels = tf.placeholder(tf.float32, [None, 1000])
    logits = inference(inputs=images,
                       num_classes=1000)
    print('inference: good job')
    cost = build_cost(logits=logits,
                      labels=labels,
                      weight_decay_rate=0.0002)
    print('build_cost: good job')
    global_step = tf.train.get_or_create_global_step()
    train_op = build_train_op(cost=cost,
                              lrn_rate=0.001,
                              global_step=global_step)
    print('build_train_op: good job')
Here we provide three functions, inference, build_cost, and build_train_op, which together cover most of what is needed to train AlexNet.
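A minimal sketch of how the three pieces could be driven in a TF 1.x session, continuing from the `__main__` block above (random dummy data here, just to show the feed pattern, not a real training run):
import numpy as np
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # a dummy batch standing in for preprocessed ImageNet data
    batch_images = np.random.rand(2, 224, 224, 3).astype(np.float32)
    batch_labels = np.eye(1000, dtype=np.float32)[np.random.randint(0, 1000, size=2)]
    _, loss = sess.run([train_op, cost],
                       feed_dict={images: batch_images, labels: batch_labels})
    print('step %d, loss %.4f' % (sess.run(global_step), loss))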
AlexNet's layer-by-layer configuration:

| Layer | Configuration |
|---|---|
| conv1 | 96 kernels, 11x11, stride 4, relu |
| lrn1 | |
| pool1 | 3x3 maxpool, stride 2 |
| conv2 | 256 kernels, 5x5, stride 1, relu |
| lrn2 | |
| pool2 | 3x3 maxpool, stride 2 |
| conv3 | 384 kernels, 3x3, stride 1, relu |
| conv4 | 384 kernels, 3x3, stride 1, relu |
| conv5 | 256 kernels, 3x3, stride 1, relu |
| pool5 | 3x3 maxpool, stride 2 |
| flatten | |
| dropout rate | 0.5 |
| fc6 | 4096, relu |
| dropout rate | 0.5 |
| fc7 | 4096, relu |
| fc8 | 1000 (logits) |