AlexNet Paper Notes
AlexNet_v1: ImageNet Classification with Deep Convolutional Neural Networks
AlexNet won first place in ILSVRC 2012, with a top-5 error more than 10 percentage points lower than the runner-up's.
AlexNet's main innovations:
1. Data augmentation
Images are randomly cropped (224x224 patches taken from 256x256 images) and horizontally flipped, which multiplies the training data by (256-224) x (256-224) x 2 = 2048. The intensities of the RGB channels are also altered (PCA-based color jitter).
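A minimal sketch of the crop-and-flip part of that augmentation, written in the same TF 1.x style as the code later in this post (the `image` placeholder is only for illustration):
import tensorflow as tf
# a single 256x256 RGB image (placeholder just for illustration)
image = tf.placeholder(tf.float32, [256, 256, 3])
# random 224x224 crop: 32 x 32 = 1024 possible crop positions
crop = tf.random_crop(image, [224, 224, 3])
# random horizontal flip doubles that: 32 x 32 x 2 = 2048 variants per image
augmented = tf.image.random_flip_left_right(crop)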
2. ReLU activation function
The activation function is ReLU, which avoids the saturation of tanh and sigmoid at both tails and therefore alleviates the vanishing-gradient problem during backpropagation.
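To make the saturation point concrete, compare the activation derivatives that the backward pass multiplies by:

$$\sigma'(x)=\sigma(x)\bigl(1-\sigma(x)\bigr)\;\xrightarrow{|x|\to\infty}\;0,\qquad \tanh'(x)=1-\tanh^2(x)\;\xrightarrow{|x|\to\infty}\;0,\qquad \mathrm{ReLU}'(x)=1\ \text{for } x>0$$

so gradients passing through a ReLU are not shrunk on the active side, whereas sigmoid/tanh gradients vanish once units sit in their flat tails.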
3. Overlapping pooling
Replacing ordinary pooling (2x2, stride 2) with overlapping pooling (3x3, stride 2) reduces the top-1 and top-5 error rates by 0.4% and 0.3%, respectively.
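A quick shape check (a sketch assuming a 13x13x256 feature map, roughly what feeds pool5) shows both settings downsample by the same factor; the difference is only that with a 3x3 window and stride 2 the pooling regions overlap:
import tensorflow as tf
from tensorflow.keras.layers import MaxPool2D
x = tf.placeholder(tf.float32, [None, 13, 13, 256])
print(MaxPool2D([2, 2], 2)(x).shape)  # (?, 6, 6, 256) - non-overlapping 2x2 windows
print(MaxPool2D([3, 3], 2)(x).shape)  # (?, 6, 6, 256) - stride 2 < window 3, so windows overlap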
4. Local Response Normalization (LRN)
LRN is claimed to help generalization, although how much it actually helps is disputed.
LRN is somewhat similar in spirit to BatchNorm.
The idea comes mainly from "lateral inhibition" in biology.
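For reference, the LRN in the paper normalizes the activity $a^i_{x,y}$ of kernel map $i$ at position $(x,y)$ over $n$ adjacent kernel maps (out of $N$ total):

$$b^i_{x,y} = a^i_{x,y} \Big/ \Bigl(k + \alpha \sum_{j=\max(0,\,i-n/2)}^{\min(N-1,\,i+n/2)} \bigl(a^j_{x,y}\bigr)^2\Bigr)^{\beta}$$

with $k=2$, $n=5$, $\alpha=10^{-4}$, $\beta=0.75$ chosen on a validation set.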
5. Dropout
Applying dropout in the fully connected layers prevents overfitting; it is an efficient way of approximately combining many different models.
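One detail worth keeping in mind when reading the code below: tf.keras's `Dropout(rate)` takes the fraction of units to drop (not a keep probability) and rescales the surviving activations at training time. A tiny sketch of its two modes:
import tensorflow as tf
from tensorflow.keras.layers import Dropout
x = tf.ones([1, 8])                # toy activations
drop = Dropout(rate=0.5)           # the paper drops each fc unit with probability 0.5
y_train = drop(x, training=True)   # random masking, survivors scaled by 1/(1-rate)
y_test = drop(x, training=False)   # identity at inference time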
6. GPU implementation
The GPU implementation of AlexNet (the network is split across two GPUs) greatly speeds up training.
The figure below shows the AlexNet architecture:
The figure below shows the AlexNet architecture as redrawn in the ZFNet paper (personally I find this one clearer):
AlexNet has 8 layers in total: 5 convolutional layers and 3 fully connected layers.
Below is AlexNet written with a mix of TF and tf.keras:
# Implemented with TF plus tf.keras.layers, which keeps everything compatible within TF
# (tf.layers inside TF is a similar high-level API and is also convenient)
import tensorflow as tf
keras = tf.keras
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense
def inference(inputs,
              num_classes=1000,
              is_training=True,
              dropout_rate=0.5):
    '''
    Inference
    inputs: a tensor of images
    num_classes: number of output classes
    is_training: set to True when building the graph for training
    dropout_rate: fraction of fc activations dropped during training
    '''
    x = inputs
    # conv1: 96 kernels, 11x11, stride 4 (the paper's geometry works out for 227x227 inputs)
    x = Conv2D(96, [11, 11], 4, activation='relu', name='conv1')(x)
    # lrn1: hyperparameters from the paper (k=2, n=5, alpha=1e-4, beta=0.75)
    x = tf.nn.local_response_normalization(
        x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75, name='lrn1')
    # pool1: overlapping max pooling, 3x3 window, stride 2
    x = MaxPool2D([3, 3], 2, name='pool1')(x)
    # conv2
    x = Conv2D(256, [5, 5], activation='relu', padding='same', name='conv2')(x)
    # lrn2
    x = tf.nn.local_response_normalization(
        x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75, name='lrn2')
    # pool2
    x = MaxPool2D([3, 3], 2, name='pool2')(x)
    # conv3
    x = Conv2D(384, [3, 3], activation='relu', padding='same', name='conv3')(x)
    # conv4
    x = Conv2D(384, [3, 3], activation='relu', padding='same', name='conv4')(x)
    # conv5
    x = Conv2D(256, [3, 3], activation='relu', padding='same', name='conv5')(x)
    # pool5
    x = MaxPool2D([3, 3], 2, name='pool5')(x)
    # flatten
    x = Flatten(name='flatten')(x)
    # dropout before fc6 (only in the training graph)
    if is_training:
        x = Dropout(dropout_rate, name='dropout5')(x)
    # fc6
    x = Dense(4096, activation='relu', name='fc6')(x)
    # dropout before fc7
    if is_training:
        x = Dropout(dropout_rate, name='dropout6')(x)
    # fc7
    x = Dense(4096, activation='relu', name='fc7')(x)
    # fc8: raw class scores (no softmax here; the loss applies it)
    logits = Dense(num_classes, name='logit')(x)
    return logits
def build_cost(logits, labels, weight_decay_rate):
    '''
    Cost: softmax cross-entropy plus L2 weight decay
    logits: predictions
    labels: one-hot ground-truth labels
    weight_decay_rate: coefficient on the L2 penalty
    '''
    with tf.variable_scope('costs'):
        with tf.variable_scope('xent'):
            xent = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                logits=logits, labels=labels))
        with tf.variable_scope('decay'):
            costs = []
            for var in tf.trainable_variables():
                costs.append(tf.nn.l2_loss(var))
                tf.summary.histogram(var.op.name, var)  # summary
            cost_decay = tf.multiply(weight_decay_rate, tf.add_n(costs))
        cost = tf.add(xent, cost_decay)
        tf.summary.scalar('cost', cost)  # summary
    return cost
def build_train_op(cost, lrn_rate, global_step):
    '''
    Train op
    cost: total cost to minimize
    lrn_rate: learning rate
    global_step: global step
    '''
    with tf.variable_scope('train'):
        lrn_rate = tf.constant(lrn_rate, tf.float32)
        tf.summary.scalar('learning_rate', lrn_rate)  # summary
        trainable_variables = tf.trainable_variables()
        grads = tf.gradients(cost, trainable_variables)
        optimizer = tf.train.AdamOptimizer(lrn_rate)
        apply_op = optimizer.apply_gradients(
            zip(grads, trainable_variables),
            global_step=global_step, name='train_step')
        train_op = apply_op
    return train_op
if __name__ == '__main__':
    images = tf.placeholder(tf.float32, [None, 224, 224, 3])
    labels = tf.placeholder(tf.float32, [None, 1000])
    logits = inference(inputs=images,
                       num_classes=1000)
    print('inference: good job')
    cost = build_cost(logits=logits,
                      labels=labels,
                      weight_decay_rate=0.0002)
    print('build_cost: good job')
    global_step = tf.train.get_or_create_global_step()
    train_op = build_train_op(cost=cost,
                              lrn_rate=0.001,
                              global_step=global_step)
    print('build_train_op: good job')
Here we provide three functions, inference, build_cost, and build_train_op, which together cover most of what is needed to train AlexNet.
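A minimal sketch of how the three pieces could be driven in a TF 1.x session, continuing from the `__main__` block above (random dummy data here, just to show the feed pattern, not a real training run):
import numpy as np
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # a dummy batch standing in for preprocessed ImageNet data
    batch_images = np.random.rand(2, 224, 224, 3).astype(np.float32)
    batch_labels = np.eye(1000, dtype=np.float32)[np.random.randint(0, 1000, size=2)]
    _, loss = sess.run([train_op, cost],
                       feed_dict={images: batch_images, labels: batch_labels})
    print('step %d, loss %.4f' % (sess.run(global_step), loss))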
AlexNet's layer-by-layer configuration:

| Layer | Configuration |
|---|---|
| conv1 | 96 kernels, 11x11, stride 4, relu |
| lrn1 | |
| pool1 | 3x3 maxpool, stride 2 |
| conv2 | 256 kernels, 5x5, stride 1, relu |
| lrn2 | |
| pool2 | 3x3 maxpool, stride 2 |
| conv3 | 384 kernels, 3x3, stride 1, relu |
| conv4 | 384 kernels, 3x3, stride 1, relu |
| conv5 | 256 kernels, 3x3, stride 1, relu |
| pool5 | 3x3 maxpool, stride 2 |
| flatten | |
| dropout rate | 0.5 |
| fc6 | 4096, relu |
| dropout rate | 0.5 |
| fc7 | 4096, relu |
| fc8 | 1000 (logits) |