
AlexNet Paper Notes


AlexNet_v1: ImageNet Classification with Deep Convolutional Neural Networks

AlexNet won first place at ILSVRC 2012; its accuracy was more than 10 percentage points higher than that of the runner-up.
Key innovations of AlexNet
1. Data augmentation

Images are randomly cropped (224x224 patches from 256x256 images) and horizontally flipped. This augmentation multiplies the amount of training data by (256-224) x (256-224) x 2 = 2048. In addition, the intensities of the RGB channels are altered (via PCA on the RGB pixel values).
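A minimal sketch of the crop-and-flip part using TF image ops (the function name preprocess_for_train is my own, not from the paper):

import tensorflow as tf

def preprocess_for_train(image):
  '''image: a [256, 256, 3] float32 tensor; returns a random [224, 224, 3] patch.'''
  # Randomly crop a 224x224 patch out of the 256x256 image.
  image = tf.random_crop(image, [224, 224, 3])
  # Randomly mirror the patch horizontally (doubles the data again).
  image = tf.image.random_flip_left_right(image)
  return image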

2. ReLU activation function

The activation function is ReLU, which avoids the saturation that tanh and sigmoid exhibit at both ends and alleviates the vanishing-gradient problem during backpropagation.
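A quick way to see this (standard calculus, not taken from the paper): the sigmoid gradient is bounded by 0.25 and tends to 0 for large |x|, while the ReLU gradient stays at 1 on the positive side:

\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr) \le 0.25, \qquad \mathrm{ReLU}'(x) = \begin{cases} 1, & x > 0 \\ 0, & x < 0 \end{cases}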

3. Overlapping pooling

Replacing ordinary pooling (2x2, stride 2) with overlapping pooling (3x3, stride 2) reduces the top-1 and top-5 error rates by 0.4% and 0.3%, respectively.
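As a small illustration (my own, not from the paper): on a 55x55 feature map, both settings produce the same 27x27 output grid; the difference is that the 3x3 windows overlap, which the paper credits with the small accuracy gain:

import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D

x = tf.zeros([1, 55, 55, 96])            # e.g. a conv1-sized feature map
print(MaxPool2D([2, 2], 2)(x).shape)      # (1, 27, 27, 96), non-overlapping windows
print(MaxPool2D([3, 3], 2)(x).shape)      # (1, 27, 27, 96), overlapping 3x3 windows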

4. Local Response Normalization (LRN)

LRN is said to help generalization, but its actual effectiveness is debated.

LRN is actually somewhat similar in spirit to BatchNorm.

The idea behind LRN comes mainly from the biological notion of "lateral inhibition".
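For reference, the normalization used in the paper is

b^i_{x,y} = a^i_{x,y} \Big/ \left( k + \alpha \sum_{j=\max(0,\, i - n/2)}^{\min(N-1,\, i + n/2)} \bigl(a^j_{x,y}\bigr)^2 \right)^{\beta}

where a^i_{x,y} is the activation of kernel i at position (x, y), N is the number of kernels in the layer, and the sum runs over n neighboring kernel maps; the paper uses k = 2, n = 5, alpha = 1e-4, beta = 0.75.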

5. Dropout

Applying dropout in the fully connected layers prevents overfitting; it is a very efficient way of combining many different trained models.
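A minimal sketch of the mechanism (my own illustration with tf.nn.dropout, not code from the paper): each unit is kept with probability keep_prob and the kept activations are scaled by 1/keep_prob, so every mini-batch effectively trains a different sub-network that shares weights with all the others.

import tensorflow as tf

fc6 = tf.ones([1, 4096])                 # a fully connected activation
# Keep each element with probability 0.5 and scale survivors by 1/0.5 = 2
# (inverted dropout), so no rescaling is needed at test time.
dropped = tf.nn.dropout(fc6, keep_prob=0.5)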

6. GPU implementation

The GPU implementation of AlexNet greatly speeds up training (the original network was split across two GTX 580 GPUs).

The figure below is the AlexNet architecture diagram:
[Figure: AlexNet architecture, from the original paper]
The figure below is the AlexNet architecture diagram given in the ZFNet paper (personally I find this one clearer):
[Figure: AlexNet architecture as redrawn in the ZFNet paper]
AlexNet has 8 layers in total: 5 conv layers and 3 fc layers.
Below is AlexNet written with a mix of TF and tf.keras:

# Implemented with TF plus tf.keras.layers, which keeps everything compatible
# tf.layers inside TF is a high-level API similar to keras.layers and is also convenient to use
import tensorflow as tf
keras = tf.keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense


def inference(inputs,
              num_classes=1000,
              is_training=True,
              dropout_keep_prob=0.5):
  '''
  Inference

  inputs: a tensor of images
  num_classes: the number of classes
  is_training: set True when used for training
  dropout_keep_prob: the probability of keeping a unit when applying dropout
  '''

  x = inputs
  # conv1
  x = Conv2D(96, [11,11], 4, activation='relu', name='conv1')(x)
  # lrn1 (note: TF's defaults differ from the paper's k=2, n=5, alpha=1e-4, beta=0.75)
  x = tf.nn.local_response_normalization(x, name='lrn1')
  # pool1
  x = MaxPool2D([3,3], 2, name='pool1')(x)
  # conv2
  x = Conv2D(256, [5,5], activation='relu', padding='same', name='conv2')(x)
  # lrn2
  x = tf.nn.local_response_normalization(x, name='lrn2')
  # pool2
  x = MaxPool2D([3,3], 2, name='pool2')(x)
  # conv3
  x = Conv2D(384, [3,3], activation='relu', padding='same', name='conv3')(x)
  # conv4
  x = Conv2D(384, [3,3], activation='relu', padding='same', name='conv4')(x)
  # conv5
  x = Conv2D(256, [3,3], activation='relu', padding='same', name='conv5')(x)
  # pool5
  x = MaxPool2D([3,3], 2, name='pool5')(x)
  # flatten
  x = Flatten(name='flatten')(x)
  # dropout
  if is_training:
    x = Dropout(1.0 - dropout_keep_prob, name='dropout5')(x)  # Keras Dropout expects the drop rate
  # fc6
  x = Dense(4096, activation='relu', name='fc6')(x)
  # dropout
  if is_training:
    x = Dropout(1.0 - dropout_keep_prob, name='dropout6')(x)
  # fc7
  x = Dense(4096, activation='relu', name='fc7')(x)
  # fc8
  logits = Dense(num_classes, name='logit')(x)
  return logits


def build_cost(logits, labels, weight_decay_rate):
  '''
  Cost: softmax cross-entropy plus L2 weight decay

  logits: predictions
  labels: true labels (one-hot)
  weight_decay_rate: coefficient of the L2 weight decay term
  '''
  with tf.variable_scope('costs'):
    with tf.variable_scope('xent'):
      xent = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
          logits=logits, labels=labels))
    with tf.variable_scope('decay'):
      costs = []
      for var in tf.trainable_variables():
        costs.append(tf.nn.l2_loss(var))
        tf.summary.histogram(var.op.name, var) # summary
      cost_decay = tf.multiply(weight_decay_rate, tf.add_n(costs))
    cost = tf.add(xent, cost_decay)
    tf.summary.scalar('cost', cost) # summary
  return cost


def build_train_op(cost, lrn_rate, global_step):
  '''
  train_op

  cost: cost
  lrn_rate: learning rate
  global_step: global step
  '''
  with tf.variable_scope('train'):
    lrn_rate = tf.constant(lrn_rate, tf.float32)
    tf.summary.scalar('learning_rate', lrn_rate) # summary

    trainable_variables = tf.trainable_variables()
    grads = tf.gradients(cost, trainable_variables)

    optimizer = tf.train.AdamOptimizer(lrn_rate)

    apply_op = optimizer.apply_gradients(
        zip(grads, trainable_variables),
        global_step=global_step, name='train_step')

    train_op = apply_op
  return train_op


if __name__ == '__main__':
  images = tf.placeholder(tf.float32, [None, 224, 224, 3])
  labels = tf.placeholder(tf.float32, [None, 1000])
  logits = inference(inputs=images,
                     num_classes=1000)
  print('inference: good job')
  cost = build_cost(logits=logits,
                    labels=labels,
                    weight_decay_rate=0.0002)
  print('build_cost: good job')
  global_step = tf.train.get_or_create_global_step()
  train_op = build_train_op(cost=cost,
                            lrn_rate=0.001,
                            global_step=global_step)
  print('build_train_op: good job')

Here we provide three functions, inference, build_cost, and build_train_op, which together implement the bulk of AlexNet.
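A minimal sketch of how these pieces could be driven in a training loop, reusing the images, labels, cost and train_op tensors built in the __main__ block above (load_batch is a hypothetical stand-in for a real input pipeline):

import numpy as np

def load_batch(batch_size=32):
  # Hypothetical placeholder: random images and one-hot labels instead of ImageNet data.
  x = np.random.rand(batch_size, 224, 224, 3).astype('float32')
  y = np.eye(1000, dtype='float32')[np.random.randint(0, 1000, batch_size)]
  return x, y

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for step in range(1000):
    image_batch, label_batch = load_batch()
    _, cost_val = sess.run([train_op, cost],
                           feed_dict={images: image_batch, labels: label_batch})
    if step % 100 == 0:
      print('step %d, cost %.4f' % (step, cost_val))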
The detailed configuration of AlexNet:

| Layer   | Configuration              |
| ------- | -------------------------- |
| conv1   | 96 @ 11x11, stride 4, relu |
| lrn1    |                            |
| pool1   | 3x3 maxpool, stride 2      |
| conv2   | 256 @ 5x5, stride 1, relu  |
| lrn2    |                            |
| pool2   | 3x3 maxpool, stride 2      |
| conv3   | 384 @ 3x3, stride 1, relu  |
| conv4   | 384 @ 3x3, stride 1, relu  |
| conv5   | 256 @ 3x3, stride 1, relu  |
| pool5   | 3x3 maxpool, stride 2      |
| flatten |                            |
| dropout | rate 0.5                   |
| fc6     | 4096                       |
| dropout | rate 0.5                   |
| fc7     | 4096                       |
| fc8     | 1000                       |
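As a sanity check on the spatial sizes (standard convolution arithmetic, not from the paper): with 'valid' padding the output width is

\text{output} = \left\lfloor \frac{W - K}{S} \right\rfloor + 1, \qquad \text{conv1: } \left\lfloor \frac{224 - 11}{4} \right\rfloor + 1 = 54

so the 224x224 input used in the code above yields a 54x54 conv1 map, while the 55x55 reported in the paper corresponds to an effective 227x227 input: floor((227 - 11)/4) + 1 = 55.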

Note: if you use the code from this blog, please add a citation.