PyTorch Official Documentation: A Beginner's Exploration
# Control Flow + Weight Sharing
'''
As an example of dynamic graphs and weight sharing, we implement a very strange model: a fully-connected ReLU
network that on each forward pass chooses a random number between 1 and 4 and uses that many hidden layers,
reusing the same weights multiple times to compute the innermost hidden layers.
For this model we can use normal Python flow control to implement the loop, and we can implement weight sharing
among the innermost layers by simply reusing the same Module multiple times when defining the forward pass.
Reference: https://blog.csdn.net/bob_chen_csdn/article/details/83661035
'''
import random
import torch
class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        '''
        In the constructor we construct three nn.Linear instances that we will use
        in the forward pass.
        Constructor: a three-layer fully-connected network.
        '''
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)    # 1000 -> 100
        self.middle_linear = torch.nn.Linear(H, H)      # 100 -> 100
        self.output_linear = torch.nn.Linear(H, D_out)  # 100 -> 10
    def forward(self, x):
        '''
        For the forward pass of the model, we randomly choose either 0, 1, 2, or 3 and reuse
        the middle_linear Module that many times to compute hidden layer representations
        (i.e. the middle layer is reused a random number of times).
        Since each forward pass builds a dynamic computation graph (a new graph is built on every
        forward pass), we can use normal Python control-flow operators like loops or conditional
        statements when defining the forward pass of the model.
        Here we also see that it is perfectly safe to reuse the same Module many times when
        defining a computational graph. This is a big improvement from Lua Torch, where each
        Module could be used only once.
        '''
        params = list(self.parameters())
        # Feed x into the first layer; clamp(min=0) zeroes out negative values, acting as the ReLU activation
        h_relu = self.input_linear(x).clamp(min=0)
        # print("h_relu:", h_relu)
        # for _ in range(random.randint(0, 3)):  # reuse the middle layer a random number of times; its weights do not change during one forward pass (weight sharing)
        for _ in range(3):  # fixed depth here, equivalent to stacking the shared middle layer three times
            print("***********")
            print("middle_linear parameters (generator object):", self.middle_linear.parameters())
            print("params[2] (middle_linear.weight):", params[2])
            print("params[3] (middle_linear.bias):", params[3])
            h_relu = self.middle_linear(h_relu).clamp(min=0)
            print("-------------------====")
        y_pred = self.output_linear(h_relu)
        return y_pred
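Before building the full model, here is a minimal standalone sketch (toy sizes, not part of the original tutorial) of why reusing a Module amounts to weight sharing: applying the same nn.Linear twice in a graph adds no new parameters.

shared = torch.nn.Linear(4, 4)            # one small layer with 4*4 + 4 = 20 parameters
t = torch.randn(2, 4)
t = shared(shared(t))                     # the very same weights are used twice in this graph
print(sum(p.numel() for p in shared.parameters()))  # still 20, not 40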
# Define the batch size, the input/output dimensions and the number of hidden units
N, D_in, H, D_out = 64, 1000, 100, 10
# Create random input and target samples
x = torch.randn(N, D_in)   # N input samples, each of dimension D_in
y = torch.randn(N, D_out)  # N target samples, each of dimension D_out
# Construct the model
model = DynamicNet(D_in, H, D_out)
# Inspect the parameters
params = list(model.parameters())
print("len(params):", len(params))
print('******p 0*******', params[0], params[0].size(), '******')  # FC1 weight, 100 x 1000
print('******p 1*******', params[1], params[1].size(), '******')  # FC1 bias, 100
print('******p 2*******', params[2], params[2].size(), '******')  # FC2 weight, 100 x 100
print('******p 3*******', params[3], params[3].size(), '******')  # FC2 bias, 100
print('******p 4*******', params[4], params[4].size(), '******')  # FC3 weight, 10 x 100
print('******p 5*******', params[5], params[5].size(), '******')  # FC3 bias, 10
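Indexing params by position works, but as a small aside, model.named_parameters() (a standard nn.Module method) makes it clearer which tensor belongs to which layer; a sketch:

for name, p in model.named_parameters():
    print(name, tuple(p.shape))  # e.g. input_linear.weight (100, 1000), input_linear.bias (100,), ...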
# Build the loss function and the optimizer (SGD with momentum)
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
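A small aside with toy tensors, just to illustrate the reduction flag: reduction='sum' adds up the squared errors over every element instead of averaging them, which is also why the loss values printed by the training loop below are fairly large.

a, b = torch.ones(2, 3), torch.zeros(2, 3)
print(torch.nn.MSELoss(reduction='sum')(a, b).item())   # 6.0: squared errors summed over all 6 elements
print(torch.nn.MSELoss(reduction='mean')(a, b).item())  # 1.0: averaged instead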
for t in range(50):
    # Forward pass: compute predicted y
    y_pred = model(x)
    # Compute the loss
    loss = criterion(y_pred, y)
    # if t % 100 == 99:
    if True:
        print(t, loss.item())
    # Zero the gradients before the backward pass
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
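After training, a quick sanity check (nothing new here, just torch.no_grad) evaluates the model once without building a computation graph:

with torch.no_grad():
    print("final training loss:", criterion(model(x), y).item())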
Within one forward pass, the middle layer's parameters do not change between loop iterations (that is the weight sharing).
Across training iterations the parameters do differ, because optimizer.step() updates them.
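The second half of that observation can be checked directly. The following sketch reuses model, params, criterion and optimizer from above and shows that an optimizer step really does change the shared middle-layer weight (params[2]):

before = params[2].detach().clone()    # snapshot of middle_linear.weight
optimizer.zero_grad()
criterion(model(x), y).backward()
optimizer.step()
print(torch.equal(before, params[2]))  # False: the shared weight was updated in place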
Original post: https://blog.csdn.net/z_6_2_0_s/article/details/107475018