The mechanics of learning
No.2 PyTorch’s autograd: Backpropagate all things

PyTorch tensors can remember where they come from in terms of the operations and parent tensors that originated them, and they can provide the chain of derivatives of such operations with respect to their inputs automatically. You won’t need to derive your model by hand; given a forward expression, no matter how nested, PyTorch provides the gradient of that expression with respect to its input parameters automatically.

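For instance, as a minimal sketch with an arbitrary made-up scalar expression (the values are purely illustrative), autograd recovers the derivative of a nested expression without any manual calculus:

import torch

x = torch.tensor(2.0, requires_grad=True)
y = (3 * x + 1) ** 2   # a nested forward expression
y.backward()           # autograd walks the recorded operations backward
print(x.grad)          # dy/dx = 6 * (3x + 1) = 42.0 at x = 2

The data, linear model, and mean squared error loss used below are set up as follows:
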
import torch

# Known temperatures in Celsius and the corresponding readings in unknown units
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4])
t_un = 0.1 * t_u  # roughly normalized inputs

def model(t_u, w, b):
    # linear model: predicted Celsius = w * t_u + b
    return w * t_u + b

def loss_fn(t_p, t_c):
    # mean squared error between predictions and targets
    squared_diffs = (t_p - t_c)**2
    return squared_diffs.mean()

Initialize a parameters tensor.

params = torch.tensor([1.0, 0.0], requires_grad=True)

The grad attribute of params contains the derivatives of the loss with respect to each element of params.
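A short sketch of how it gets populated, using the model, loss_fn, normalized inputs t_un, and params defined above: running a forward pass and calling backward fills in params.grad.

loss = loss_fn(model(t_un, *params), t_c)
loss.backward()
print(params.grad)  # two values: d(loss)/dw and d(loss)/db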
Every time backward is computed, the gradients from the previous step must be zeroed first; otherwise the gradient values keep accumulating (the sketch after the snippet below demonstrates this):

if params.grad is not None:
    params.grad.zero_()
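To make the accumulation concrete, here is a small sketch (assuming the model, loss_fn, t_un, and t_c from above): calling backward twice without zeroing in between sums the two gradients instead of replacing them.

params = torch.tensor([1.0, 0.0], requires_grad=True)
loss_fn(model(t_un, *params), t_c).backward()
first_grad = params.grad.clone()
loss_fn(model(t_un, *params), t_c).backward()   # no zero_() in between
print(torch.allclose(params.grad, 2 * first_grad))  # True: the gradients were added up
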
def training_loop(n_epochs, learning_rate, params, t_u, t_c):
    for epoch in range(1, n_epochs + 1):
        if params.grad is not None:  # zero the gradients left over from the previous iteration
            params.grad.zero_()

        t_p = model(t_u, *params)
        loss = loss_fn(t_p, t_c)
        loss.backward()
        # update params outside the graph, then re-enable gradient tracking
        params = (params - learning_rate * params.grad).detach().requires_grad_()

        if epoch % 500 == 0:
            print('Epoch %d, Loss %f' % (epoch, float(loss)))

    return params

training_loop(
    n_epochs = 5000,
    learning_rate = 1e-2,
    params = torch.tensor([1.0, 0.0], requires_grad=True),  # fresh params with autograd enabled
    t_u = t_un,  # train on the normalized inputs
    t_c = t_c)

We detach the new params tensor from the computation graph associated with its update expression by calling .detach(). This way, params effectively loses the memory of the operations that generated it. We then re-enable tracking by calling .requires_grad_(), an in-place operation (note the trailing _) that reactivates autograd for the tensor.
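As a small standalone sketch of one such update step (reusing model, loss_fn, t_un, and t_c from above, with the same 1e-2 learning rate as in the call above):

p = torch.tensor([1.0, 0.0], requires_grad=True)
loss = loss_fn(model(t_un, *p), t_c)
loss.backward()
p = (p - 1e-2 * p.grad).detach().requires_grad_()
print(p.requires_grad, p.grad)  # True None: a fresh leaf tensor with no stale history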