TensorFlow Beginner Tutorial (3): Picking Out the MNIST Images That a Two-Layer Convolutional Neural Network Misclassifies

程序员文章站 2024-03-14 21:30:23


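1. Overview

Yesterday I learned how to train on MNIST with a Softmax regression model and with a two-layer convolutional neural network. The neural network reaches 99.31% accuracy, but I am curious what kind of outlandish handwriting can still fool it. Is there really handwriting uglier than mine? So in this note I save the misclassified images to take a look.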

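2. TensorFlow implementation

To keep things simple, I copy yesterday's code and modify it. The trained model was already saved yesterday, so there is no need to retrain this time.

2.1. Add the path for saving images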

# Directory for the saved images; create it if it does not exist
image_path = './image_path/'
if not os.path.exists(image_path):
    os.mkdir(image_path)
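The check-then-create pattern above can also be written as a single call. This is a minimal sketch, assuming Python 3.2 or later (where `os.makedirs` accepts `exist_ok`); the helper name `ensure_dir` is made up for illustration and is not part of the original code.

```python
import os

def ensure_dir(path):
    """Create `path` (and any missing parent directories) if it does not exist yet."""
    # exist_ok=True makes the call a no-op when the directory is already there
    os.makedirs(path, exist_ok=True)
    return path

if __name__ == '__main__':
    image_path = ensure_dir('./image_path/')
```

Unlike `os.mkdir`, this also works when the parent directory is missing.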

2.2. Check whether a trained model exists; if not, retrain

# Flag: does a trained model exist? If not, train from scratch
is_train_model_exist = True

# Check whether the trained model exists.
# For simplicity, the model is considered present if mnist_conv.ckpt.index exists
savePath = './mnist_conv/'
saveFile = savePath + 'mnist_conv.ckpt'
if not os.path.exists(saveFile + '.index'):
    print('Not found the CKPT files!')
    is_train_model_exist = False
......
# Restore the saved variables
if is_train_model_exist:
    saver.restore(sess, saveFile)
    print("start testing...")
else:
    # Initialize all variables
    sess.run(tf.global_variables_initializer())
    # Train for 20,000 steps
    for i in range(20000):
        # Fetch 50 images and their labels for each step
        batch = mnist.train.next_batch(50)
        # Print the training accuracy every 100 steps
        if i % 100 == 0:
            train_accuracy = sess.run(accuracy, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g" % (i, train_accuracy))
        # The actual training step: feed the batch in, with dropout enabled
        sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    print("end train, start testing...")

if not is_train_model_exist:
    # Finally, save the trained variables
    saver.save(sess, saveFile)
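The check above treats the `.index` file alone as proof that a model was saved. A checkpoint written by `tf.train.Saver` in its default format also has one or more `.data-?????-of-?????` shard files, so a slightly stricter test can look for both. The following is only a sketch using the standard library; the helper name `checkpoint_exists` is hypothetical.

```python
import glob
import os

def checkpoint_exists(prefix):
    """Return True if the index file and at least one data shard exist.

    `prefix` is the path passed to saver.save(), for example
    './mnist_conv/mnist_conv.ckpt'.
    """
    has_index = os.path.isfile(prefix + '.index')
    # glob.escape keeps characters such as '[' in the prefix from acting as patterns
    has_data = bool(glob.glob(glob.escape(prefix) + '.data-*'))
    return has_index and has_data
```

With this helper, the flag could be set as `is_train_model_exist = checkpoint_exists(saveFile)`.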

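2.3. Run the test and save the misclassified images as JPG files

# Run the test
for i in range(mnist.test.labels.shape[0]):
    # Evaluate one image at a time so each result is easy to check
    batch = mnist.test.next_batch(1)
    result = sess.run(correct_prediction, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})

    if not result[0]:
        # Get the network output to see which digit it predicted for this image
        result1 = sess.run(y_conv, feed_dict={x: batch[0], keep_prob: 1.0})
        # print (sess.run(tf.argmax(batch[1], 1))[0])
        image = batch[0].reshape(28, 28)
        # Note: tf.argmax(batch[1], 1) only builds an op; its value is computed only when passed to sess.run
        # File name format: image_<loop index>_<true label>_<predicted label>.jpg
        filename = image_path + 'image_%d_%d_%d.jpg' % (i, sess.run(tf.argmax(batch[1], 1))[0], sess.run(tf.argmax(result1, 1))[0])
        # scipy.misc.toimage was removed in SciPy 1.2, so this line needs an older SciPy
        sm.toimage(image).save(filename)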
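Two details of the loop above are worth a second look: `scipy.misc.toimage` no longer exists in SciPy 1.2 and later, and every `tf.argmax(...)` call inside the loop adds a new op to the graph. The sketch below shows one possible replacement, assuming NumPy and Pillow are installed and that pixel values are floats in [0, 1], which is what `input_data.read_data_sets` returns by default. The helper names `save_digit` and `decode` are made up for illustration. Note that `toimage` rescaled each image by its own minimum and maximum, while this version uses a fixed 0 to 255 scale.

```python
import numpy as np
from PIL import Image

def save_digit(pixels, filename):
    """Save a flat 784-value MNIST image (floats in [0, 1]) as a grayscale file."""
    image = np.asarray(pixels, dtype=np.float32).reshape(28, 28)
    # Scale to 0..255 and convert to 8-bit; Pillow treats a 2-D uint8 array as mode 'L'
    image_u8 = np.clip(image * 255.0, 0, 255).astype(np.uint8)
    Image.fromarray(image_u8).save(filename)

def decode(one_hot_or_probs):
    """Return the index of the largest value in the first row, computed in NumPy."""
    return int(np.argmax(one_hot_or_probs, axis=1)[0])
```

Inside the loop this would read `save_digit(batch[0], filename)`, with `decode(batch[1])` and `decode(result1)` supplying the true and predicted digits for the file name.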

2.4. Results

If the model exists:

~/tensorflow-master/demo/unit1 $ python demo6.py
Extracting mnist_data/train-images-idx3-ubyte.gz
Extracting mnist_data/train-labels-idx1-ubyte.gz
Extracting mnist_data/t10k-images-idx3-ubyte.gz
Extracting mnist_data/t10k-labels-idx1-ubyte.gz
start testing...

If the model does not exist:

Extracting mnist_data/train-images-idx3-ubyte.gz
Extracting mnist_data/train-labels-idx1-ubyte.gz
Extracting mnist_data/t10k-images-idx3-ubyte.gz
Extracting mnist_data/t10k-labels-idx1-ubyte.gz
Not found the CKPT files!
step 0, training accuracy 0.04
step 100, training accuracy 0.82
step 200, training accuracy 0.9
.....
step 19700, training accuracy 1
step 19800, training accuracy 1
step 19900, training accuracy 1
end train, start testing...

The misclassified images that were saved:

[Figure: the saved misclassified images]
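Because each saved file name encodes the true label and the predicted label, the output directory can be summarized without opening a single image. The following is a small sketch using only the standard library; it assumes the `image_<index>_<label>_<prediction>.jpg` naming used in section 2.3, and the helper name `tally_confusions` is hypothetical.

```python
import os
import re
from collections import Counter

# Matches names such as image_123_4_9.jpg: (index, true label, predicted label)
NAME_RE = re.compile(r'^image_(\d+)_(\d)_(\d)\.jpg$')

def tally_confusions(filenames):
    """Count (true_label, predicted_label) pairs encoded in the file names."""
    pairs = Counter()
    for name in filenames:
        match = NAME_RE.match(os.path.basename(name))
        if match:  # silently skip files that do not follow the naming scheme
            pairs[(int(match.group(2)), int(match.group(3)))] += 1
    return pairs

if __name__ == '__main__':
    image_path = './image_path/'
    if os.path.isdir(image_path):
        counts = tally_confusions(os.listdir(image_path))
        for (label, pred), n in counts.most_common(10):
            print('true %d predicted as %d: %d image(s)' % (label, pred, n))
```

Sorting by count shows which digit pairs the network confuses most often.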

2.5. Complete code

# coding: utf-8
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import os
import scipy.misc as sm

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

mnist = input_data.read_data_sets('mnist_data', one_hot=True)

# Initialize the filters (weights)
def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

# Initialize the biases; every value starts at 0.1
def bias_variable(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))

# Convolution. strides gives the step along each dimension; usually strides[0] = strides[3] = 1
# The fourth argument, padding, is "SAME" or "VALID"; "SAME" pads the borders with zeros
def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding="SAME")

# Max pooling over 2×2 windows
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")


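# Create the placeholder x, which holds the MNIST image data fed in at run time.
# In [None, 784], None means the batch size is not fixed, and 784 is the size of one image (28×28=784)
x = tf.placeholder(tf.float32, [None, 784])
# y_ holds the true labels, i.e. the actual digit for each input image
y_ = tf.placeholder(tf.float32, [None, 10])

# Reshape each 784-element vector back into a 28×28 image.
# See the convolutional network diagram for the reason. The last dimension is the depth;
# MNIST images are grayscale, so the depth is 1.
# The first dimension is -1, so it is inferred and the batch size can be chosen freely
x_image = tf.reshape(x, [-1, 28, 28, 1])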


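# First convolutional layer
# The filter is a 5×5×1 tensor:
# 5×5 is the filter size and 1 is the input depth, because MNIST images are grayscale with a single channel
# 32 is the number of filters, so the convolution produces 32 feature maps, i.e. an output depth of 32
W_conv1 = weight_variable([5, 5, 1, 32])
# One bias per output channel
b_conv1 = bias_variable([32])
# Convolve with conv2d, then apply ReLU as the activation function
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
# Pooling follows the convolution
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
# The first layer outputs depth 32, so the filter depth is now 32 and the output depth becomes 64
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)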

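# Fully connected layer
# After two convolutional layers the image is 7×7 (the first pooling outputs (28/2)×(28/2),
# the second pooling outputs (14/2)×(14/2)), with depth 64.
# We add a fully connected layer with 1024 neurons here, so the weight W has shape [7 * 7 * 64, 1024]
W_fc1 = weight_variable([7 * 7 * 64, 1024])
# One bias per output neuron
b_fc1 = bias_variable([1024])
# Flatten the second pooling output (height 7, width 7, depth 64) into a vector, like the input of the Softmax model in the previous note
h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])
# Apply the ReLU activation function
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# Dropout
# To reduce overfitting, we add dropout before the output layer
keep_prob = tf.placeholder(tf.float32)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)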

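# Output layer
# The fully connected layer outputs 1024 values and we need 10 results (digits 0 to 9),
# so the weight W has shape [1024, 10]
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])
# Finally, Softmax converts the outputs into probabilities
y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Loss function and optimizer
# reduce_sum without an axis already sums over the whole batch, so the outer reduce_mean receives a scalar and leaves it unchanged
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_conv)))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

# Test accuracy, computed the same way as in the Softmax regression model
correct_prediction = tf.equal(tf.argmax(y_conv, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))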


# Flag: does a trained model exist? If not, train from scratch
is_train_model_exist = True

# Check whether the trained model exists.
# For simplicity, the model is considered present if mnist_conv.ckpt.index exists
savePath = './mnist_conv/'
saveFile = savePath + 'mnist_conv.ckpt'
if not os.path.exists(saveFile + '.index'):
    print('Not found the CKPT files!')
    is_train_model_exist = False

saver = tf.train.Saver()

# Directory for the saved images; create it if it does not exist
image_path = './image_path/'
if not os.path.exists(image_path):
    os.mkdir(image_path)

# Start the session: restore or train, then test
with tf.Session() as sess:
    # Restore the saved variables
    if is_train_model_exist:
        saver.restore(sess, saveFile)
        print("start testing...")
    else:
        # Initialize all variables
        sess.run(tf.global_variables_initializer())
        # Train for 20,000 steps
        for i in range(20000):
            # Fetch 50 images and their labels for each step
            batch = mnist.train.next_batch(50)
            # Print the training accuracy every 100 steps
            if i % 100 == 0:
                train_accuracy = sess.run(accuracy, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})
                print("step %d, training accuracy %g" % (i, train_accuracy))
            # The actual training step: feed the batch in, with dropout enabled
            sess.run(train_step, feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
        print("end train, start testing...")

    if not is_train_model_exist:
        # Finally, save the trained variables
        saver.save(sess, saveFile)

    # Run the test
    for i in range(mnist.test.labels.shape[0]):
        # Evaluate one image at a time so each result is easy to check
        batch = mnist.test.next_batch(1)
        result = sess.run(correct_prediction, feed_dict={x: batch[0], y_: batch[1], keep_prob: 1.0})

        if not result[0]:
            # Get the network output to see which digit it predicted for this image
            result1 = sess.run(y_conv, feed_dict={x: batch[0], keep_prob: 1.0})
            # print (sess.run(tf.argmax(batch[1], 1))[0])
            image = batch[0].reshape(28, 28)
            # Note: tf.argmax(batch[1], 1) only builds an op; its value is computed only when passed to sess.run
            # File name format: image_<loop index>_<true label>_<predicted label>.jpg
            filename = image_path + 'image_%d_%d_%d.jpg' % (i, sess.run(tf.argmax(batch[1], 1))[0], sess.run(tf.argmax(result1, 1))[0])
            # scipy.misc.toimage was removed in SciPy 1.2, so this line needs an older SciPy
            sm.toimage(image).save(filename)
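The layer sizes quoted in the comments of the listing (28 to 14 to 7, and the `7 * 7 * 64` flattened vector) can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, assuming Python 3; it relies on the rule that with "SAME" padding the output side is the input side divided by the stride, rounded up. The function names are made up for illustration.

```python
import math

def same_out(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return math.ceil(size / stride)

def trace_shapes(size=28):
    """Follow one image side through conv1, pool1, conv2 and pool2."""
    shapes = []
    for name, stride, depth in [('conv1', 1, 32), ('pool1', 2, 32),
                                ('conv2', 1, 64), ('pool2', 2, 64)]:
        size = same_out(size, stride)
        shapes.append((name, size, size, depth))
    return shapes

def count_parameters():
    """Weights plus biases of the four trainable layers in the listing."""
    conv1 = 5 * 5 * 1 * 32 + 32
    conv2 = 5 * 5 * 32 * 64 + 64
    fc1 = 7 * 7 * 64 * 1024 + 1024
    fc2 = 1024 * 10 + 10
    return conv1 + conv2 + fc1 + fc2

for shape in trace_shapes():
    print(shape)
print('flattened size:', 7 * 7 * 64)
print('trainable parameters:', count_parameters())
```

Almost all of the parameters sit in the first fully connected layer, which is why the flattened size matters for the shape of `W_fc1`.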

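Summary:

As the results show, some of the images are ones we can read at a glance yet the machine gets wrong, some are so outlandish that I cannot read them either, and some I misread myself... This also shows that, at this stage, artificial intelligence works well as an assistive tool that reduces human workload, but it has not yet reached the point where we can rely on the machine entirely.

Note: these are purely personal study notes. If anything is wrong, corrections are welcome.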