Building a Convolutional Neural Network Demo - Classifying the CIFAR-10 Dataset

程序员文章站 2022-03-17 20:53:24


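A simple example of implementing a convolutional neural network

General architecture of a convolutional neural network

Classifying the CIFAR-10 dataset with a simple convolutional neural network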

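A simple example of implementing a convolutional neural network

Compared with a fully connected neural network, the main advance of a convolutional neural network is the introduction of convolutional layers and pooling layers. Both are essential components of the network.

General architecture of a convolutional neural network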

[Figure: architecture of a simple convolutional neural network for image classification]

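The figure above shows the architecture of a simple convolutional neural network for image classification.

1. Input layer

The input layer is the input to the whole network. In a convolutional neural network for image classification, it represents the pixel matrix of one image. The depth of the pixel matrix depends on the number of channels: a grayscale image has one channel, so its depth is 1; an image in RGB color mode has three channels, so its depth is 3.

2. Convolutional layer

The diagram above contains two convolutional layers. This layer extracts features: a set of convolution kernels is convolved with the input to produce many abstract features. Common kernel sizes are 3×3 and 5×5. In general, the unit matrix of a convolutional layer is deeper than that of the layer before it.

3. Pooling layer

The unit matrix of a pooling layer is no deeper than that of the layer before it, but the layer shrinks the matrix in height and width. This dimensionality reduction cuts the number of parameters needed by the layers that follow, and so by the network as a whole.

4. Fully connected layer

This layer combines the results of the preceding convolution and pooling layers.

5. Softmax layer

This layer produces the probability distribution over the classes that the input example may belong to.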
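As a minimal sketch of what the softmax layer computes, the pure-Python function below turns a list of raw class scores into a probability distribution. The function name softmax and the sample scores are illustrative assumptions and are not part of the original demo.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is
    # unchanged because softmax is invariant to a constant shift.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three classes
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The largest score receives the largest probability, and the ordering of the classes is preserved.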

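In addition, because convolution is a special kind of linear transformation, the output of a convolutional layer has to pass through a nonlinear activation before it is handed to the pooling layer. The ReLU function is a common choice.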
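The activation and pooling steps can be sketched in pure Python on a single-channel feature map. This is only an illustration under simplifying assumptions: the helper names and the 4x4 input are made up, and the pooling uses non-overlapping 2x2 windows, whereas the demo later in this article uses 3x3 windows with stride 2.

```python
def relu(feature_map):
    """Replace every negative value with 0 (element-wise)."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; assumes even height and width."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            window = [feature_map[r][c], feature_map[r][c + 1],
                      feature_map[r + 1][c], feature_map[r + 1][c + 1]]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 output of a convolution (one channel)
conv_out = [
    [1.0, -2.0, 3.0, 0.5],
    [-1.0, 4.0, -3.0, 2.0],
    [0.0, -5.0, -1.0, -2.0],
    [2.0, 1.0, -4.0, -0.5],
]
activated = relu(conv_out)
pooled = max_pool_2x2(activated)
print(pooled)  # [[4.0, 3.0], [2.0, 0.0]]
```

Pooling halves the height and width here and leaves the number of channels untouched, which matches the description of the pooling layer above.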

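Building a simple convolutional neural network to classify the CIFAR-10 dataset

1. Dataset overview

Classifying the CIFAR-10 dataset is a public benchmark problem in machine learning. The task is to classify a set of 32x32 RGB images that cover 10 categories:
airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.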


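The CIFAR-10 dataset contains 60,000 color images of 32×32 pixels, divided into 10 classes. 50,000 images are used for training and 10,000 for testing. The dataset is split into 5 training batches and 1 test batch. The test batch contains 1,000 randomly selected images from each class; taken together, the training batches contain 5,000 images of each class.

This experiment uses cifar-10-binary.tar.gz.

Link: https://pan.baidu.com/s/10KyjBjFDqI-pUy7u9JHQBg
Extraction code: u72w
(Baidu Netdisk download)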

Extract the archive after downloading it.

In the extracted directory, test_batch.bin holds the 10,000 test images, and each of the other five .bin files holds 10,000 training images.

2. Data processing

Start by writing the Cifar10_data.py file.

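import os
import tensorflow as tf  # TensorFlow 1.x API (queues, placeholders, sessions)

num_classes = 10

# Total number of examples used for training and for evaluation
num_examples_pre_epoch_for_train = 50000
num_examples_pre_epoch_for_eval = 10000


# An empty class used to hold the CIFAR-10 data that is read
class CIFAR10Record(object):
    pass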

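Next, define a read_cifar10() function in the file to read data from the file queue; its argument is a queue. The function first creates an instance of the CIFAR10Record class, whose height, width, and depth attributes store the height, width, and depth of one image.

image_bytes is the data length of one image (3072 values). The image data and its label are stored together in the .bin files, so record_bytes is defined as the combined length of the two.

Fixed-length records can be read from a .bin file with the FixedLengthRecordReader class. The record length (record_bytes) is passed in when the class is instantiated, and calling the reader's read() method with the file queue reads one record.

The resulting value is a string of record_bytes bytes that contains the label and the image data of one record. decode_raw() parses this string into an array of uint8 values. strided_slice() then takes the elements at indices [0, 1) as the label, which is cast to int32, and the remaining elements as the image data, which reshape() turns into a 3-dimensional tensor.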


# Function that reads CIFAR-10 data
def read_cifar10(file_queue):
    result = CIFAR10Record()  # create an instance to hold the record

    label_bytes = 1  # use 2 for the CIFAR-100 dataset
    result.height = 32
    result.width = 32
    result.depth = 3  # RGB, so the depth is 3
    image_bytes = result.height * result.width * result.depth

    # Every record holds one label followed by one image
    record_bytes = label_bytes + image_bytes

    # FixedLengthRecordReader reads a fixed number of bytes per record
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    result.key, value = reader.read(file_queue)
    # decode_raw() parses the string into a uint8 array of length record_bytes
    record = tf.decode_raw(value, tf.uint8)
    # The first element of the array is the label; cast it to int32
    result.label = tf.cast(tf.strided_slice(record, [0], [label_bytes]), tf.int32)

    # Everything after the label is image data, stored as [depth, height, width]
    depth_major = tf.reshape(tf.strided_slice(record, [label_bytes], [label_bytes + image_bytes]),
                             [result.depth, result.height, result.width])

    # transpose() converts [depth, height, width] to [height, width, depth]
    result.uint8image = tf.transpose(depth_major, [1, 2, 0])
    return result
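The record layout that read_cifar10() decodes can also be checked without TensorFlow. The sketch below parses one 3073-byte record (1 label byte followed by 3072 image bytes in depth-major order) using only built-in Python types. The function name parse_record and the synthetic record are assumptions made for illustration; they do not appear in the original files.

```python
def parse_record(record, height=32, width=32, depth=3):
    """Split one CIFAR-10 binary record into (label, image).

    The image is returned as nested lists indexed [row][column][channel].
    """
    label_bytes = 1
    image_bytes = height * width * depth
    if len(record) != label_bytes + image_bytes:
        raise ValueError("unexpected record length: %d" % len(record))

    label = record[0]
    pixels = record[label_bytes:]
    plane = height * width  # number of values in one color channel

    # The file stores all red values, then all green, then all blue
    # (depth-major); regroup them so each pixel carries its 3 channels.
    image = [
        [
            [pixels[d * plane + r * width + c] for d in range(depth)]
            for c in range(width)
        ]
        for r in range(height)
    ]
    return label, image


# Synthetic record: label 7, red plane all 10, green all 20, blue all 30
fake = bytes([7]) + bytes([10]) * 1024 + bytes([20]) * 1024 + bytes([30]) * 1024
label, image = parse_record(fake)
print(label, image[0][0], len(fake))  # 7 [10, 20, 30] 3073
```

The regrouping loop plays the same role as the reshape() followed by transpose() in read_cifar10(): it turns [depth, height, width] storage into [height, width, depth].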

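Next comes the inputs() function. Its data_dir argument is the directory that holds the raw CIFAR-10 data. The function first builds the full file paths with os.path.join() and passes them to tf.train.string_input_producer(), which creates the file queue.

The queue is passed to the read_cifar10() function written above, which returns a result instance holding the CIFAR-10 data: result.uint8image stores the raw image data and result.label stores the corresponding label.

To make later image processing easier, the raw image is converted to float32.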

def inputs(data_dir, batch_size, distorted):
    # Training reads the five training batches; evaluation reads test_batch.bin
    if distorted is not None:
        filenames = [os.path.join(data_dir, "data_batch_%d.bin" % i) for i in range(1, 6)]
    else:
        filenames = [os.path.join(data_dir, "test_batch.bin")]

    # Create a file queue and read records from it with read_cifar10()
    file_queue = tf.train.string_input_producer(filenames)
    read_input = read_cifar10(file_queue)

    reshaped_image = tf.cast(read_input.uint8image, tf.float32)

    num_examples_pre_epoch = num_examples_pre_epoch_for_train

    # distorted decides whether the images are augmented (random crop,
    # flip, and so on) to produce more varied training samples
    if distorted is not None:
        # Randomly crop the [32,32,3] image to [24,24,3]
        cropped_image = tf.image.random_crop(reshaped_image, [24, 24, 3])

        # Randomly flip the image left to right
        flipped_image = tf.image.random_flip_left_right(cropped_image)

        # Randomly adjust the brightness
        adjusted_brightness = tf.image.random_brightness(flipped_image, max_delta=0.8)

        # Randomly adjust the contrast
        adjusted_contrast = tf.image.random_contrast(adjusted_brightness, lower=0.2, upper=1.8)

        # Standardize the image
        float_image = tf.image.per_image_standardization(adjusted_contrast)

        # Set the static shapes of the image and the label
        float_image.set_shape([24, 24, 3])
        read_input.label.set_shape([1])

        min_queue_examples = int(num_examples_pre_epoch_for_eval * 0.4)
        print('Filling queue with %d CIFAR images before starting to train.' % min_queue_examples)
        print('This will take a few minutes.')

        # shuffle_batch() produces one randomly shuffled batch of images and labels
        images_train, labels_train = tf.train.shuffle_batch([float_image, read_input.label], batch_size=batch_size,
                                                            num_threads=16, capacity=min_queue_examples + 3 * batch_size,
                                                            min_after_dequeue=min_queue_examples)
        return images_train, tf.reshape(labels_train, [batch_size])
    else:
        # No augmentation: resize the image to 24x24
        resized_image = tf.image.resize_image_with_pad(reshaped_image, 24, 24)

        # Standardize directly
        float_image = tf.image.per_image_standardization(resized_image)

        # Set the static shapes of the image and the label
        float_image.set_shape([24, 24, 3])
        read_input.label.set_shape([1])

        min_queue_examples = int(num_examples_pre_epoch * 0.4)
        images_test, labels_test = tf.train.batch([float_image, read_input.label],
                                                  batch_size=batch_size, num_threads=16,
                                                  capacity=min_queue_examples + 3 * batch_size)
        return images_test, tf.reshape(labels_test, [batch_size])

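Depending on the value of its distorted argument, inputs() also decides whether to apply data augmentation to the images it reads. Augmentation enriches the training data and improves the model's ability to generalize. When distorted is not None, the following steps are applied:

1 Randomly crop the image to [24, 24, 3]: tf.image.random_crop()

2 Random left-right flip: tf.image.random_flip_left_right()

3 Random brightness adjustment: tf.image.random_brightness()

4 Random contrast adjustment: tf.image.random_contrast()

5 Per-image standardization: tf.image.per_image_standardization()

6 Shuffle the examples into random batches: tf.train.shuffle_batch()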
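The first two augmentation steps above can be sketched in pure Python to show what they do to the pixel grid. This is only an illustration: the helper names, the fixed seed, and the single-channel test image are assumptions, and the brightness and contrast steps are left out.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def random_flip_left_right(image, rng):
    """Mirror the image horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical 32x32 single-channel image whose value encodes its position
image = [[r * 32 + c for c in range(32)] for r in range(32)]
cropped = random_crop(image, 24, 24, rng)
flipped = random_flip_left_right(cropped, rng)
print(len(flipped), len(flipped[0]))  # 24 24
```

Because the crop position and the flip are drawn again for every example, the network rarely sees exactly the same 24x24 input twice, which is the point of the augmentation.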

The case without data augmentation is simpler. When distorted is None, inputs() applies only two steps:

1 Resize the image to [24, 24, 3]: tf.image.resize_image_with_pad()

2 Per-image standardization: tf.image.per_image_standardization()
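Per-image standardization is applied in both cases, so a small sketch of the arithmetic may help. It follows the formula given in the TensorFlow documentation for tf.image.per_image_standardization(), (x - mean) / max(stddev, 1/sqrt(N)), where N is the number of values in the image. The pure-Python function and the sample pixels below are illustrative assumptions, not code from the original demo.

```python
import math


def per_image_standardization(pixels):
    """Scale a flat list of pixel values to zero mean and unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # Lower bound on the divisor so a uniform image does not divide by zero
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]


# Hypothetical 2x2 grayscale image, flattened
standardized = per_image_standardization([0.0, 50.0, 100.0, 150.0])
print(standardized)
```

After this step every image has roughly the same scale, regardless of how bright or dark the original photo was.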

The core of this demo comes next: designing and training the convolutional neural network. This part is written in CNN_Cifar-10.py.

import tensorflow as tf
import numpy as np
import time
import math
import Cifar10_data as Cd

max_steps = 4000
batch_size = 100
num_examples_for_eval = 10000
data_dir = "cifar-10-batches-bin/"  # directory holding the extracted .bin files

Weights are created with the truncated_normal() function, and an L2 loss is added for each weight tensor, which amounts to L2 regularization.


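def variable_with_weight_loss(shape, stddev, wl):
    # Initialize the weights from a truncated normal distribution
    var = tf.Variable(tf.truncated_normal(shape, stddev=stddev))

    if wl is not None:
        # Scale the L2 loss of the weights by wl and collect it under "losses"
        weights_loss = tf.multiply(tf.nn.l2_loss(var), wl, name="weights_loss")
        tf.add_to_collection("losses", weights_loss)
    return var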
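To make the penalty term concrete, here is a minimal pure-Python sketch of the quantity that variable_with_weight_loss() adds to the "losses" collection. tf.nn.l2_loss() computes half the sum of squares of its input. The helper names and the sample weights are assumptions chosen for illustration.

```python
def l2_loss(values):
    """Half the sum of squares, the quantity tf.nn.l2_loss() computes."""
    return sum(v * v for v in values) / 2.0


def weight_penalty(values, wl):
    """L2 loss scaled by the coefficient wl."""
    return l2_loss(values) * wl


# Hypothetical flattened weights
weights = [0.5, -1.0, 2.0]
print(l2_loss(weights))  # 2.625
print(weight_penalty(weights, 0.004))
```

With wl=0.0 the penalty is zero, so in this demo only the layers created with wl=0.004 are actually regularized.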

Next, create the training batches and the test batches.

# Training images; distorted=True turns on data augmentation
images_train, labels_train = Cd.inputs(data_dir=data_dir, batch_size=batch_size, distorted=True)
# Test images; distorted=None turns augmentation off
images_test, labels_test = Cd.inputs(data_dir=data_dir, batch_size=batch_size, distorted=None)

Then define the placeholders x and y_.

# Create the placeholders x and y_
x = tf.placeholder(tf.float32, [batch_size, 24, 24, 3])
y_ = tf.placeholder(tf.int32, [batch_size])

Define 2 convolutional layers (each followed by pooling) and 3 fully connected layers.

# Forward pass
# First convolutional layer
# Kernel: 5×5, 3 input channels, 64 kernels (so the output depth is 64)
kernel1 = variable_with_weight_loss(shape=[5, 5, 3, 64], stddev=5e-2, wl=0.0)

conv1 = tf.nn.conv2d(x, kernel1, strides=[1, 1, 1, 1], padding="SAME")
bias1 = tf.Variable(tf.constant(0.0, shape=[64]))
relu1 = tf.nn.relu(tf.nn.bias_add(conv1, bias1))
pool1 = tf.nn.max_pool(relu1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="SAME")

# Second convolutional layer
kernel2 = variable_with_weight_loss(shape=[5, 5, 64, 64], stddev=5e-2, wl=0.0)
conv2 = tf.nn.conv2d(pool1, kernel2, strides=[1, 1, 1, 1], padding="SAME")
bias2 = tf.Variable(tf.constant(0.1, shape=[64]))
relu2 = tf.nn.relu(tf.nn.bias_add(conv2, bias2))
pool2 = tf.nn.max_pool(relu2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="SAME")

# The three fully connected layers follow

# Flatten each example into a vector
reshape = tf.reshape(pool2, [batch_size, -1])
dim = reshape.get_shape()[1].value

# First fully connected layer
weight1 = variable_with_weight_loss(shape=[dim, 384], stddev=0.04, wl=0.004)
fc_bias1 = tf.Variable(tf.constant(0.1, shape=[384]))
fc_1 = tf.nn.relu(tf.matmul(reshape, weight1) + fc_bias1)

# Second fully connected layer
weight2 = variable_with_weight_loss(shape=[384, 192], stddev=0.04, wl=0.004)
fc_bias2 = tf.Variable(tf.constant(0.1, shape=[192]))
local4 = tf.nn.relu(tf.matmul(fc_1, weight2) + fc_bias2)

# Third fully connected layer; outputs the logits for the 10 classes
weight3 = variable_with_weight_loss(shape=[192, 10], stddev=1 / 192.0, wl=0.0)
fc_bias3 = tf.Variable(tf.constant(0.0, shape=[10]))
result = tf.add(tf.matmul(local4, weight3), fc_bias3)
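It can help to trace the tensor shapes through the forward pass by hand. The sketch below assumes the usual rule for "SAME" padding, where the output height and width equal ceil(input / stride), and follows the layer settings in the listing above. The helper name same_out is an assumption for illustration.

```python
import math


def same_out(size, stride):
    """Output height/width of a "SAME"-padded conv or pool: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


size, depth = 24, 3                    # input: 24x24x3
size, depth = same_out(size, 1), 64    # conv1, stride 1 -> 24x24x64
size = same_out(size, 2)               # pool1, stride 2 -> 12x12x64
size, depth = same_out(size, 1), 64    # conv2, stride 1 -> 12x12x64
size = same_out(size, 2)               # pool2, stride 2 -> 6x6x64
dim = size * size * depth              # flattened length fed to the first fc layer
fc1_weights = dim * 384                # weights in the first fc layer

print(size, depth, dim, fc1_weights)  # 6 64 2304 884736
```

So dim in the listing evaluates to 2304, and the first fully connected layer alone holds 884,736 weights, far more than the two convolutional layers.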

That completes the forward pass of the network. The loss is computed as cross-entropy plus L2 regularization.

# Compute the loss
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=result, labels=tf.cast(y_, tf.int64))
weights_with_l2_loss = tf.add_n(tf.get_collection("losses"))
loss = tf.reduce_mean(cross_entropy) + weights_with_l2_loss
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
# Top-k correctness of the predictions; k = 1 here, i.e. top-1
top_k_op = tf.nn.in_top_k(result, y_, 1)
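For a single example, the two quantities above reduce to short formulas. The sketch below computes the cross-entropy of one example from its logits and an integer label, and checks whether the label is the top-1 prediction. It is a pure-Python illustration under assumed names and sample logits, not the TensorFlow implementation.

```python
import math


def sparse_softmax_cross_entropy(logits, label):
    """-log(softmax(logits)[label]), computed via log-sum-exp for stability."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(v - shift) for v in logits))
    return log_sum_exp - logits[label]


def in_top_1(logits, label):
    """True when the labelled class has the highest score."""
    return logits[label] == max(logits)


# Hypothetical logits for a 3-class problem
logits = [2.0, 1.0, 0.1]
print(sparse_softmax_cross_entropy(logits, 0))
print(in_top_1(logits, 0), in_top_1(logits, 1))  # True False
```

The loss is small when the labelled class has the largest logit and grows as its logit falls behind the others. Averaging it over the batch gives the first term of loss.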

Finally, create a session and train.


with tf.Session() as sess:
    tf.global_variables_initializer().run()

    # Start the queue-runner threads that fill the input queues
    tf.train.start_queue_runners()

    for step in range(max_steps):
        start_time = time.time()
        image_batch, label_batch = sess.run([images_train, labels_train])

        _, loss_value = sess.run([train_op, loss], feed_dict={x: image_batch, y_: label_batch})

        duration = time.time() - start_time

        if step % 100 == 0:
            examples_per_sec = batch_size / duration
            sec_per_batch = float(duration)

            print("step %d, loss = %.2f (%.1f examples/sec; %.3f sec/batch)" % (step, loss_value, examples_per_sec, sec_per_batch))

    # Evaluate: count the correct top-1 predictions over the evaluation batches
    num_batch = int(math.ceil(num_examples_for_eval / batch_size))
    true_count = 0
    total_sample_count = num_batch * batch_size

    for j in range(num_batch):
        image_batch, label_batch = sess.run([images_test, labels_test])
        predictions = sess.run([top_k_op], feed_dict={x: image_batch, y_: label_batch})
        true_count += np.sum(predictions)
    print("accuracy = %.3f%%" % ((true_count / total_sample_count) * 100))

This experiment can also be trained with an exponentially decaying learning rate, in which case the accuracy can reach 86%.
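The article does not give the decay schedule, so the following is only a sketch of the formula that an exponentially decaying learning rate follows: learning_rate * decay_rate ^ (global_step / decay_steps). In TensorFlow 1.x this schedule is available as tf.train.exponential_decay(). The starting rate, decay_steps, and decay_rate values below are hypothetical.

```python
def exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False):
    """learning_rate * decay_rate ** (global_step / decay_steps)."""
    exponent = global_step / float(decay_steps)
    if staircase:
        exponent = global_step // decay_steps  # decay in discrete steps
    return learning_rate * decay_rate ** exponent


# Hypothetical schedule: start at 1e-3 and halve every 1000 steps
for step in (0, 1000, 2000, 4000):
    print(step, exponential_decay(1e-3, step, decay_steps=1000, decay_rate=0.5))
```

A larger rate early on speeds up the first steps, while the smaller rate later lets the optimizer settle. Whether this reproduces the accuracy quoted above depends on the schedule the author used, which is not stated.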
