[Peking University] 4 Derivation and Implementation of Backpropagation in TensorFlow 1.x
2022-07-13 12:27:12
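1 Key concepts
(1) Backpropagation: trains the model parameters by applying gradient descent to all of them, so that the NN model's loss function on the training data is minimized.
(2) Loss function (loss): the gap between the predicted value (y) and the known answer (y_).
(3) Mean squared error (MSE):
loss = tf.reduce_mean(tf.square(y_-y))
(4) Backpropagation training method: takes reducing the loss value as the optimization objective.
(5) Learning rate: determines how much the parameters change on each update.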
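The concepts above can be illustrated without TensorFlow. The following is a minimal sketch, assuming a one-parameter model y = w * x and a toy data set generated by y = 2x; the names `mse`, `mse_gradient`, `xs`, and `targets` are made up for this example and do not come from the course code.

```python
def mse(predictions, targets):
    """Mean squared error between predictions (y) and known answers (y_)."""
    return sum((t - p) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def mse_gradient(w, xs, targets):
    """Derivative of mean((w*x - t)^2) with respect to w: mean(2 * (w*x - t) * x)."""
    return sum(2 * (w * x - t) * x for x, t in zip(xs, targets)) / len(xs)


# Toy data (an assumption for illustration): targets follow y = 2x.
xs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]

LEARNING_RATE = 0.05  # size of each parameter update
w = 0.0               # initial parameter value
losses = []

for step in range(50):
    losses.append(mse([w * x for x in xs], targets))
    # Gradient descent: move w against the gradient of the loss.
    w -= LEARNING_RATE * mse_gradient(w, xs, targets)

print("first loss: %g, last loss: %g, w: %.4f" % (losses[0], losses[-1], w))
```

A smaller learning rate would take more steps to approach w = 2, while one that is too large can make the updates overshoot and the loss grow instead of shrink.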
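2 Neural network implementation process
(1) Prepare the data set and extract features to feed to the neural network as input.
(2) Build the NN structure from input to output (build the computation graph first, then execute it in a session).
(3) Feed a large amount of feature data to the NN and iteratively optimize its parameters.
(4) Use the trained model for prediction and classification.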
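Steps (1) and (3) can be sketched in plain Python: generate labeled feature pairs, then walk through them in fixed-size batches that wrap around to the start of the data set. This is only an illustration. It uses the standard `random` module rather than NumPy, the labeling rule x0 + x1 < 1 mirrors the script in section 3, and the helper `batch_bounds` is a hypothetical name.

```python
import random

BATCH_SIZE = 8
DATASET_SIZE = 32

random.seed(23455)  # a fixed seed makes the generated data repeatable

# Step (1): two features per sample, label 1 when their sum is below 1.
X = [[random.random(), random.random()] for _ in range(DATASET_SIZE)]
Y_ = [[int(x0 + x1 < 1)] for (x0, x1) in X]


def batch_bounds(step, batch_size=BATCH_SIZE, dataset_size=DATASET_SIZE):
    """Return the (start, end) slice bounds of the batch used at a given step."""
    start = (step * batch_size) % dataset_size
    return start, start + batch_size


# Step (3): each training step consumes one batch of features and labels.
for step in range(5):
    start, end = batch_bounds(step)
    x_batch, y_batch = X[start:end], Y_[start:end]
    print(step, start, end, len(x_batch), len(y_batch))
```

Because 32 is a multiple of 8, every batch is full and the fifth step starts again at index 0. With a data set size that is not a multiple of the batch size, the last slice before wrapping would be shorter.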
3 Code implementation
#coding:utf-8
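# 0 Import modules and generate a simulated data set.
# TensorFlow study notes (Peking University), tf3_6.py: a complete walkthrough of building a neural network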
import tensorflow as tf
import numpy as np
BATCH_SIZE = 8
SEED = 23455
rdm = np.random.RandomState(SEED)
X = rdm.rand(32,2)
Y_ = [[int(x0 + x1 < 1)] for (x0, x1) in X]
print("X:\n",X)
print("Y_:\n",Y_)
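# 1 Define the inputs, parameters, and forward pass.
x = tf.placeholder(tf.float32, shape=(None, 2))  # input placeholder; 2 is the number of features (volume and weight)
y_ = tf.placeholder(tf.float32, shape=(None, 1))  # label placeholder; 1 value per sample: pass (1) or fail (0)
w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))  # normally distributed random values; 2 inputs feed 3 hidden neurons
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))  # normally distributed random values; 3 hidden neurons feed 1 output
a = tf.matmul(x, w1)  # matrix multiplication: hidden layer
y = tf.matmul(a, w2)  # matrix multiplication: output layer
# 2 Define the loss function and the backpropagation method.
loss_mse = tf.reduce_mean(tf.square(y-y_))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss_mse)  # 0.001 is the learning rate; other optimizers can be swapped in
#train_step = tf.train.MomentumOptimizer(0.001,0.9).minimize(loss_mse)  # alternative optimizer
#train_step = tf.train.AdamOptimizer(0.001).minimize(loss_mse)  # alternative optimizer
# 3 Create a session and train for STEPS rounds.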
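with tf.Session() as sess:
    init_op = tf.global_variables_initializer()  # op that initializes all variables
    sess.run(init_op)
    # Print the current (untrained) parameter values.
    print("w1:\n", sess.run(w1))
    print("w2:\n", sess.run(w2))
    print("\n")
    # Train the model.
    STEPS = 3000
    for i in range(STEPS):  # 3000 rounds
        start = (i*BATCH_SIZE) % 32  # start index of this batch in the data set
        end = start + BATCH_SIZE  # end index (exclusive) of this batch
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y_[start:end]})
        if i % 500 == 0:
            total_loss = sess.run(loss_mse, feed_dict={x: X, y_: Y_})
            print("After %d training step(s), loss_mse on all data is %g" % (i, total_loss))
    # Print the parameter values after training.
    print("\n")
    print("w1:\n", sess.run(w1))
    print("w2:\n", sess.run(w2))
# Note: defining tensors and variables only builds the computation graph that
# carries the computation; nothing is computed yet. To get the results, we have to use a "Session()".
# Session: executes the node operations in the computation graph.
# Outside the session, these print the Variable objects, not their values.
print("w1:\n", w1)
print("w2:\n", w2)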
"""
X:
[[ 0.83494319 0.11482951]
[ 0.66899751 0.46594987]
[ 0.60181666 0.58838408]
[ 0.31836656 0.20502072]
[ 0.87043944 0.02679395]
[ 0.41539811 0.43938369]
[ 0.68635684 0.24833404]
[ 0.97315228 0.68541849]
[ 0.03081617 0.89479913]
[ 0.24665715 0.28584862]
[ 0.31375667 0.47718349]
[ 0.56689254 0.77079148]
[ 0.7321604 0.35828963]
[ 0.15724842 0.94294584]
[ 0.34933722 0.84634483]
[ 0.50304053 0.81299619]
[ 0.23869886 0.9895604 ]
[ 0.4636501 0.32531094]
[ 0.36510487 0.97365522]
[ 0.73350238 0.83833013]
[ 0.61810158 0.12580353]
[ 0.59274817 0.18779828]
[ 0.87150299 0.34679501]
[ 0.25883219 0.50002932]
[ 0.75690948 0.83429824]
[ 0.29316649 0.05646578]
[ 0.10409134 0.88235166]
[ 0.06727785 0.57784761]
[ 0.38492705 0.48384792]
[ 0.69234428 0.19687348]
[ 0.42783492 0.73416985]
[ 0.09696069 0.04883936]]
Y_:
[[1], [0], [0], [1], [1], [1], [1], [0], [1], [1], [1], [0], [0], [0], [0], [0], [0], [1], [0], [0], [1], [1], [0], [1], [0], [1], [1], [1], [1], [1], [0], [1]]
w1:
[[-0.81131822 1.48459876 0.06532937]
[-2.4427042 0.0992484 0.59122431]]
w2:
[[-0.81131822]
[ 1.48459876]
[ 0.06532937]]
After 0 training step(s), loss_mse on all data is 5.13118
After 500 training step(s), loss_mse on all data is 0.429111
After 1000 training step(s), loss_mse on all data is 0.409789
After 1500 training step(s), loss_mse on all data is 0.399923
After 2000 training step(s), loss_mse on all data is 0.394146
After 2500 training step(s), loss_mse on all data is 0.390597
w1:
[[-0.70006633 0.9136318 0.08953571]
[-2.3402493 -0.14641267 0.58823055]]
w2:
[[-0.06024267]
[ 0.91956186]
[-0.0682071 ]]
"""
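The script above leaves the gradient computation to `minimize()`. For this network the gradients can also be written out by hand. With a = x·w1, y = a·w2 and loss = mean((y - y_)^2) over N samples, the chain rule gives dL/dy = 2(y - y_)/N, dL/dw2 = aᵀ·dL/dy, dL/da = dL/dy·w2ᵀ, and dL/dw1 = xᵀ·dL/da. The NumPy sketch below applies these formulas with plain gradient descent. It is an illustration under stated assumptions, not the TensorFlow implementation: the function names are hypothetical, the initial weights are copied from the log above, and it runs in float64, so the printed numbers may differ slightly from that log.

```python
import numpy as np


def forward(x, w1, w2):
    """Forward pass of the 2-3-1 linear network; returns hidden layer and output."""
    a = x @ w1
    y = a @ w2
    return a, y


def mse(y, y_):
    """Mean squared error over all samples."""
    return np.mean((y - y_) ** 2)


def gradients(x, y_, w1, w2):
    """Backpropagate the MSE loss to w1 and w2 using the chain rule."""
    a, y = forward(x, w1, w2)
    dy = 2.0 * (y - y_) / y.size   # dL/dy
    dw2 = a.T @ dy                 # dL/dw2
    da = dy @ w2.T                 # dL/da
    dw1 = x.T @ da                 # dL/dw1
    return dw1, dw2


# Same data recipe as the script above.
rdm = np.random.RandomState(23455)
X = rdm.rand(32, 2)
Y_ = np.array([[int(x0 + x1 < 1)] for (x0, x1) in X], dtype=np.float64)

# Initial weights copied from the untrained values in the log above.
w1 = np.array([[-0.81131822, 1.48459876, 0.06532937],
               [-2.4427042, 0.0992484, 0.59122431]])
w2 = np.array([[-0.81131822], [1.48459876], [0.06532937]])

LEARNING_RATE, BATCH_SIZE, STEPS = 0.001, 8, 3000
initial_loss = mse(forward(X, w1, w2)[1], Y_)
for i in range(STEPS):
    start = (i * BATCH_SIZE) % 32
    end = start + BATCH_SIZE
    dw1, dw2 = gradients(X[start:end], Y_[start:end], w1, w2)
    w1 -= LEARNING_RATE * dw1      # gradient descent update
    w2 -= LEARNING_RATE * dw2
final_loss = mse(forward(X, w1, w2)[1], Y_)
print("loss before training: %g, after: %g" % (initial_loss, final_loss))
```

A quick way to check hand-derived gradients like these is to compare one entry against a finite difference of the loss, nudging a single weight by a small amount and dividing the change in loss by that amount.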
4 Summary
(1) Forward propagation: define the inputs, parameters, and outputs
x =
y_=
w1 =
w2 =
a=
y=
(2) Backpropagation: define the loss function and the backpropagation method
loss =
train_step =
(3) Create a session and train for STEPS rounds
```python
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()  # op that initializes all variables
    sess.run(init_op)
    # Train the model.
    STEPS = 3000
    for i in range(STEPS):  # 3000 rounds
        start =
        end =
        sess.run(train_step, feed_dict=)
```