
TensorFlow basics 6: preventing overfitting with dropout

程序员文章站 2022-07-13 10:29:53

To prevent overfitting, use dropout. tf.nn.dropout takes a keep_prob argument in (0, 1.0] that gives the fraction of neurons kept active during training.
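The mechanism is easy to sketch without TensorFlow. Below is a minimal NumPy version of inverted dropout, the variant tf.nn.dropout implements: each unit is kept with probability keep_prob, and the survivors are scaled by 1/keep_prob so the expected activation is unchanged and no rescaling is needed at test time. The helper name `dropout` here is just for illustration.

```python
import numpy as np

def dropout(x, keep_prob, rng=np.random.default_rng(0)):
    """Inverted dropout: zero each unit with probability 1-keep_prob,
    scale the survivors by 1/keep_prob so the expected output equals the input."""
    if keep_prob == 1.0:
        return x  # all neurons active, nothing to do
    mask = rng.random(x.shape) < keep_prob  # True for kept units
    return x * mask / keep_prob

x = np.ones((1000, 100))
out = dropout(x, 0.7)
# Roughly 70% of the entries survive, each scaled by 1/0.7,
# so the mean stays close to 1.0
print(round(out.mean(), 2))
```

Because of the 1/keep_prob scaling, the same graph can be evaluated with keep_prob = 1.0 at test time without any correction, which is exactly how the code below uses it.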

1. Below, 100% of the neurons are kept (keep_prob = 1.0), and overfitting appears: training accuracy climbs well above test accuracy.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

#Load the data; one_hot=True converts each label to a 10-dimensional one-hot vector
mnist=input_data.read_data_sets('mnist_data',one_hot=True)

#Size of each batch; the model is trained one batch at a time
batch_size=100 #100 images per batch
#Compute how many batches there are in total
n_batch=mnist.train.num_examples//batch_size # // is integer division, giving the number of batches

#Define the placeholders; keep_prob is the new one added for dropout
x=tf.placeholder(tf.float32,[None,784])#None is the number of images, 784 is the number of pixels per image
y=tf.placeholder(tf.float32,[None,10])#labels; 10 classes for the digits 0-9
keep_prob=tf.placeholder(tf.float32)#fraction of neurons to keep when applying dropout

'''
#A simple network: 784 input neurons, 10 output neurons, no hidden layer
#Initializing the weights to zero is not a good choice and can be improved
W=tf.Variable(tf.zeros([784,10]))#weights, a variable initialized to zero
b=tf.Variable(tf.zeros([10]))#biases
prediction=tf.nn.softmax(tf.matmul(x,W)+b)#weighted sum passed through softmax (the activation function) to get class probabilities
'''

W1=tf.Variable(tf.truncated_normal([784,2000],stddev=0.1))#truncated-normal initialization with standard deviation 0.1
b1=tf.Variable(tf.zeros([2000])+0.1)
L1=tf.nn.tanh(tf.matmul(x,W1)+b1)#hidden-layer output, with tanh as the activation function
L1_drop=tf.nn.dropout(L1,keep_prob)#keep_prob is the fraction of neurons kept; 1.0 keeps them all

W2=tf.Variable(tf.truncated_normal([2000,2000],stddev=0.1))
b2=tf.Variable(tf.zeros([2000])+0.1)
L2=tf.nn.tanh(tf.matmul(L1_drop,W2)+b2)
L2_drop=tf.nn.dropout(L2,keep_prob)

W3=tf.Variable(tf.truncated_normal([2000,1000],stddev=0.1))
b3=tf.Variable(tf.zeros([1000])+0.1)
L3=tf.nn.tanh(tf.matmul(L2_drop,W3)+b3)
L3_drop=tf.nn.dropout(L3,keep_prob)

W4=tf.Variable(tf.truncated_normal([1000,10],stddev=0.1))
b4=tf.Variable(tf.zeros([10])+0.1)
logits=tf.matmul(L3_drop,W4)+b4
prediction=tf.nn.softmax(logits)

#Quadratic cost function
#loss =tf.reduce_mean(tf.square(y-prediction))

#Cross-entropy cost; note that softmax_cross_entropy_with_logits expects the
#pre-softmax logits, not the softmax output, so pass logits here
loss=tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=logits))

#Gradient descent with learning rate 0.2
train_step=tf.train.GradientDescentOptimizer(0.2).minimize(loss)

#Op that initializes the variables
init=tf.global_variables_initializer()

#Accuracy: the results go into a boolean list; argmax returns the position of the largest entry along a dimension
correct_prediction=tf.equal(tf.argmax(y,1),tf.argmax(prediction,1))#axis 1 means per row, 0 per column; this finds the position of the highest-probability label. equal returns True where the two arguments match, False otherwise

#Compute the accuracy
accuracy=tf.reduce_mean(tf.cast(correct_prediction,tf.float32))#cast converts the booleans to float32 (True becomes 1.0, False becomes 0.0) before averaging
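The argmax/equal/cast pipeline above is easy to check with a small NumPy example (the toy label and prediction values are made up for illustration):

```python
import numpy as np

# Two one-hot labels and two softmax-style outputs (toy values)
y = np.array([[1, 0, 0],
              [0, 1, 0]])
prediction = np.array([[0.8, 0.1, 0.1],   # argmax 0 -> matches label
                       [0.2, 0.3, 0.5]])  # argmax 2 -> does not match

# Same steps as the TF graph: argmax per row, compare, cast, average
correct = np.argmax(y, axis=1) == np.argmax(prediction, axis=1)  # [True, False]
accuracy = correct.astype(np.float32).mean()  # True -> 1.0, False -> 0.0
print(accuracy)  # 0.5
```

One of the two rows is classified correctly, so the accuracy comes out as 0.5, exactly as the reduce_mean over the cast booleans would compute.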

with tf.Session() as sess:
    sess.run(init)#initialize the variables
    for epoch in range(31):#31 epochs; every training image is seen once per epoch
        for batch in range(n_batch):
            batch_xs,batch_ys=mnist.train.next_batch(batch_size)#next batch of 100 images; data in batch_xs, labels in batch_ys
            sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys,keep_prob:1.0})

        test_acc=sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels,keep_prob:1.0})#feed the test images, get predictions, compare with the test labels
        train_acc=sess.run(accuracy,feed_dict={x:mnist.train.images,y:mnist.train.labels,keep_prob:1.0})#the gap between train and test accuracy reveals the overfitting
        print('Iter'+str(epoch)+',Testing Accuracy'+str(test_acc)+',Training Accuracy'+str(train_acc))



Iter0,Testing Accuracy0.9458,Training Accuracy0.95698184
Iter1,Testing Accuracy0.9593,Training Accuracy0.9751091
Iter2,Testing Accuracy0.9633,Training Accuracy0.98274547
Iter3,Testing Accuracy0.9654,Training Accuracy0.9861091
Iter4,Testing Accuracy0.9671,Training Accuracy0.9881455
Iter5,Testing Accuracy0.9671,Training Accuracy0.9895273
Iter6,Testing Accuracy0.9681,Training Accuracy0.9904909
Iter7,Testing Accuracy0.9687,Training Accuracy0.9911818
Iter8,Testing Accuracy0.9696,Training Accuracy0.9916546
Iter9,Testing Accuracy0.9688,Training Accuracy0.9921273
Iter10,Testing Accuracy0.9689,Training Accuracy0.99256366
Iter11,Testing Accuracy0.969,Training Accuracy0.99285454
Iter12,Testing Accuracy0.9703,Training Accuracy0.9932182
Iter13,Testing Accuracy0.9698,Training Accuracy0.99363637
Iter14,Testing Accuracy0.9699,Training Accuracy0.9938909
Iter15,Testing Accuracy0.971,Training Accuracy0.99416363
Iter16,Testing Accuracy0.9706,Training Accuracy0.9944364
Iter17,Testing Accuracy0.9706,Training Accuracy0.9945091
Iter18,Testing Accuracy0.9715,Training Accuracy0.99472725
Iter19,Testing Accuracy0.9711,Training Accuracy0.9949091
Iter20,Testing Accuracy0.9722,Training Accuracy0.99505454
Iter21,Testing Accuracy0.9724,Training Accuracy0.99512726
Iter22,Testing Accuracy0.9722,Training Accuracy0.9951818
Iter23,Testing Accuracy0.9721,Training Accuracy0.99527276
Iter24,Testing Accuracy0.9722,Training Accuracy0.99529094
Iter25,Testing Accuracy0.9717,Training Accuracy0.99536365
Iter26,Testing Accuracy0.9722,Training Accuracy0.9954182
Iter27,Testing Accuracy0.9711,Training Accuracy0.9954727
Iter28,Testing Accuracy0.9719,Training Accuracy0.99554545
Iter29,Testing Accuracy0.9722,Training Accuracy0.9956
Iter30,Testing Accuracy0.9722,Training Accuracy0.99563634



2. Below, 70% of the neurons are kept during training (keep_prob = 0.7), which reduces the overfitting: the train/test gap is much smaller.

The code is identical to the listing above except for the training step, which feeds keep_prob:0.7 instead of 1.0:

            sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys,keep_prob:0.7})

Evaluation still feeds keep_prob:1.0, because dropout should only be active during training.






Iter0,Testing Accuracy0.916,Training Accuracy0.91083634
Iter1,Testing Accuracy0.9292,Training Accuracy0.9274909
Iter2,Testing Accuracy0.9374,Training Accuracy0.93578184
Iter3,Testing Accuracy0.9402,Training Accuracy0.94092727
Iter4,Testing Accuracy0.9416,Training Accuracy0.9450182
Iter5,Testing Accuracy0.9462,Training Accuracy0.9480364
Iter6,Testing Accuracy0.9481,Training Accuracy0.95181817
Iter7,Testing Accuracy0.952,Training Accuracy0.95461816
Iter8,Testing Accuracy0.9545,Training Accuracy0.9568
Iter9,Testing Accuracy0.9541,Training Accuracy0.9579273
Iter10,Testing Accuracy0.9564,Training Accuracy0.9604
Iter11,Testing Accuracy0.9583,Training Accuracy0.96183634
Iter12,Testing Accuracy0.9588,Training Accuracy0.9624909
Iter13,Testing Accuracy0.9585,Training Accuracy0.9642182
Iter14,Testing Accuracy0.9607,Training Accuracy0.96572727
Iter15,Testing Accuracy0.961,Training Accuracy0.9670182
Iter16,Testing Accuracy0.9626,Training Accuracy0.9672
Iter17,Testing Accuracy0.9638,Training Accuracy0.9692909
Iter18,Testing Accuracy0.9642,Training Accuracy0.97016364
Iter19,Testing Accuracy0.9648,Training Accuracy0.9711273
Iter20,Testing Accuracy0.9649,Training Accuracy0.9715273
Iter21,Testing Accuracy0.966,Training Accuracy0.97185457
Iter22,Testing Accuracy0.9689,Training Accuracy0.97365457
Iter23,Testing Accuracy0.9678,Training Accuracy0.9742909
Iter24,Testing Accuracy0.9682,Training Accuracy0.9750909
Iter25,Testing Accuracy0.9691,Training Accuracy0.9750909
Iter26,Testing Accuracy0.9695,Training Accuracy0.9761818
Iter27,Testing Accuracy0.9694,Training Accuracy0.9763455
Iter28,Testing Accuracy0.9704,Training Accuracy0.9769273
Iter29,Testing Accuracy0.9703,Training Accuracy0.97769094
Iter30,Testing Accuracy0.9687,Training Accuracy0.9780909
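Comparing the final-epoch numbers from the two logs shows what dropout buys here: training accuracy drops slightly, but the train/test gap shrinks from about 2.3 points to under 1 point.

```python
# Final-epoch accuracies copied from the two training logs above
no_dropout_train, no_dropout_test = 0.99563634, 0.9722
dropout_train,    dropout_test    = 0.9780909,  0.9687

# The overfitting gap is the difference between train and test accuracy
gap_no_dropout = no_dropout_train - no_dropout_test
gap_dropout = dropout_train - dropout_test
print(round(gap_no_dropout, 4), round(gap_dropout, 4))
```

With more epochs the dropout run would likely keep improving, since at epoch 30 its training accuracy is still well below 1.0, whereas the keep_prob = 1.0 run has essentially memorized the training set.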