
MNIST Classification with an LSTM Model


Setting the RNN parameters

  This time we use an RNN for a classification task, again on the handwritten-digit MNIST dataset. The RNN reads each image row by row, from the first row of pixels to the last, and then makes its classification decision. First, import the MNIST data and set the RNN's parameters:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('./MNIST_data', one_hot=True)

lr = 0.001  # learning rate
training_iters = 100000  # upper bound on the number of training samples seen
batch_size = 128
n_inputs = 28  # each time step feeds in one 28-pixel row (img shape: 28*28)
n_steps = 28  # 28 time steps, one per image row
n_hidden_units = 128  # number of units in the hidden layer
n_classes = 10  # number of classes (digits 0-9)
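
To make the sequence interpretation concrete, here is a small sketch (an added illustration, not part of the original script): each flattened 784-pixel MNIST image is reshaped into 28 time steps of 28 pixels, so the RNN reads the image one row at a time.

import numpy as np

flat = np.zeros((1, 784), dtype=np.float32)   # one flattened MNIST image
seq = flat.reshape([1, n_steps, n_inputs])    # (1, 28, 28): 28 steps, each a 28-pixel row
print(seq.shape)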

Next, define the placeholders for x and y and the initial values of weights and biases:

# placeholders for the input images and labels
x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])

weights = {  # define the weights
    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),  # shape (28, 128)
    'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))  # shape (128, 10)
}

biases = {
    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units, ])),  # shape (128, )
    'out': tf.Variable(tf.constant(0.1, shape=[n_classes, ]))  # shape (10, )
}
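
As a quick sanity check on these definitions (an added snippet, not in the original article), the declared shapes can be read straight off the variables:

print(weights['in'].shape)   # (28, 128): projects one 28-pixel row into 128 hidden units
print(weights['out'].shape)  # (128, 10): projects the final hidden state onto 10 class scores
print(biases['in'].shape)    # (128,)
print(biases['out'].shape)   # (10,)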

Defining the main RNN structure

  Now we define the main RNN structure, which consists of three parts (an input layer, the cell, and an output layer). We start with the input layer:

def RNN(X, weights, biases):
    # The incoming X is 3-D (batch, steps, inputs); flatten it to 2-D so it can
    # be multiplied by weights['in']
    X = tf.reshape(X, [-1, n_inputs])  # X ==> (128 batches * 28 steps, 28 inputs)

    # input layer into the hidden state:
    # X_in = X * W + b
    # X_in ==> (128 batches * 28 steps, 128 hidden)
    X_in = tf.matmul(X, weights['in']) + biases['in']
    # reshape back to 3-D: X_in ==> (128 batches, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

Next comes the computation inside the cell, using tf.nn.dynamic_rnn(cell, inputs). Because of TensorFlow version upgrades, state_is_tuple=True will become the default in later releases. For an LSTM, the state is split into (c_state, h_state):

# use a basic LSTM cell
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)
# initialize the state to zeros; the LSTM state has two parts, (c_state, h_state)
init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)
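
As a quick illustration (an added snippet, not part of the original script), the zero state returned with state_is_tuple=True is an LSTMStateTuple whose c and h parts each have shape (batch_size, n_hidden_units):

# init_state.c and init_state.h are the zero-filled cell state and hidden state
print(init_state.c.shape)  # (128, 128), i.e. (batch_size, n_hidden_units)
print(init_state.h.shape)  # (128, 128)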

If we use tf.nn.dynamic_rnn(cell, inputs), we have to make sure the layout of inputs matches the time_major argument:

  • If inputs has shape (batches, steps, inputs), time_major is False.
  • If inputs has shape (steps, batches, inputs), time_major is True.

# dynamic_rnn accepts a tensor of shape (batch, steps, inputs) or (steps, batch, inputs) as X_in
outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, X_in, initial_state=init_state, time_major=False)
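
If you prefer the time-major layout from the second bullet, a minimal sketch of the equivalent call (an illustrative variant meant to replace the call above, not to run alongside it) would transpose X_in first:

# time-major variant: move the step axis to the front before calling dynamic_rnn
X_in_tm = tf.transpose(X_in, [1, 0, 2])  # (28 steps, 128 batches, 128 hidden)
outputs_tm, final_state_tm = tf.nn.dynamic_rnn(
    lstm_cell, X_in_tm, initial_state=init_state, time_major=True)
# outputs_tm also comes back time-major: (28 steps, 128 batches, 128 hidden)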

The return value is computed as follows, using the h_state stored in final_state (i.e. final_state[1]) directly:

results = tf.matmul(final_state[1], weights['out']) + biases['out']

Finally, the RNN function returns results:

return results
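
Since an LSTM's h_state equals its output at the last time step, an equivalent way to compute results (an alternative sketch, not used in this article) is to take the last element of outputs:

# alternative: unstack outputs along the time axis and use the last step's output
# outputs has shape (128 batches, 28 steps, 128 hidden); move the step axis to the front first
outputs_list = tf.unstack(tf.transpose(outputs, [1, 0, 2]))  # list of 28 tensors, each (128, 128)
results_alt = tf.matmul(outputs_list[-1], weights['out']) + biases['out']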

With the main RNN structure defined, we can compute the cost and the train_op:

pred = RNN(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
train_op = tf.train.AdamOptimizer(lr).minimize(cost)

Training the RNN

  The training process is as follows:

# accuracy: fraction of samples whose predicted class matches the true label
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    step = 0

    # keep training until the sample budget (training_iters) is used up
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        # reshape each flat 784-pixel image into (n_steps, n_inputs) = (28, 28)
        batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])
        sess.run([train_op], feed_dict={x: batch_xs, y: batch_ys})

        # print the accuracy on the current training batch every 20 steps
        if step % 20 == 0:
            print(sess.run(accuracy, feed_dict={x: batch_xs, y: batch_ys}))

        step += 1
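
After training, accuracy on unseen data can be checked with the same accuracy op. This is an illustrative addition (not part of the original script): it has to sit inside the same with tf.Session() block, after the while loop, the test images must be reshaped the same way as the training batches, and because init_state was built with a fixed batch_size, it evaluates exactly batch_size test images:

    # hypothetical test-set evaluation, appended inside the Session after the training loop
    test_xs = mnist.test.images[:batch_size].reshape([batch_size, n_steps, n_inputs])
    test_ys = mnist.test.labels[:batch_size]
    print('test accuracy:', sess.run(accuracy, feed_dict={x: test_xs, y: test_ys}))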