
autograd Automatic Differentiation - PyTorch


1. torch.autograd.backward

Function: automatically computes gradients

  • tensors: the tensor(s) to differentiate, e.g. loss

  • retain_graph: keep the computational graph after the backward pass

  • create_graph: build a graph of the derivative, used for higher-order differentiation

  • grad_tensors: weights applied to each gradient when there are multiple outputs
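
A minimal sketch of calling the module-level function directly (the tensor names below are made up for illustration); y.backward(...) simply forwards its arguments to torch.autograd.backward():

import torch

x = torch.tensor([1., 2.], requires_grad=True)
z = x * x   # non-scalar output, shape (2,)

# equivalent to z.backward(gradient=torch.ones_like(z))
torch.autograd.backward(z, grad_tensors=torch.ones_like(z))
print(x.grad)   # tensor([2., 4.]), i.e. dz/dx = 2x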

autograd: computational graphs and gradient computation


2. torch.autograd.grad

Function: computes and returns gradients

• outputs: the tensor(s) to differentiate, e.g. loss

• inputs: the tensor(s) for which gradients are required

• create_graph: build a graph of the derivative, used for higher-order differentiation

• retain_graph: keep the computational graph after the backward pass

• grad_outputs: weights applied to each gradient when there are multiple outputs
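
A minimal sketch of torch.autograd.grad with grad_outputs (the tensor names are illustrative). Unlike backward, grad returns the gradients as a tuple instead of accumulating them into .grad:

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = x ** 2   # non-scalar output

# grad_outputs plays the same role as grad_tensors in backward():
# it weights each element of y before the gradient is formed
(g,) = torch.autograd.grad(outputs=y, inputs=x,
                           grad_outputs=torch.ones_like(y))
print(g)        # tensor([2., 4., 6.]); x.grad itself stays None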

3. autograd caveats:

1. Gradients are not cleared automatically; they accumulate across backward passes

2. Nodes that depend on leaf nodes have requires_grad=True by default (see the sketch after this list)

3. Leaf nodes must not be modified in-place
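
A minimal sketch of caveat 2 (using the same w, x, a names as the examples below): once a leaf tensor has requires_grad=True, every node computed from it requires grad as well, while only the original tensors report is_leaf=True:

import torch

w = torch.tensor([1.], requires_grad=True)   # leaf node
x = torch.tensor([2.], requires_grad=True)   # leaf node

a = torch.add(w, x)                          # intermediate (non-leaf) node

print(w.is_leaf, a.is_leaf)      # True False
print(a.requires_grad)           # True, inherited from w and x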

import torch
torch.manual_seed(10)
help(torch.autograd.backward)
    Arguments:
        tensors (sequence of Tensor): Tensors of which the derivative will be
            computed.
        grad_tensors (sequence of (Tensor or None)): The "vector" in the Jacobian-vector
            product, usually gradients w.r.t. each element of corresponding tensors.
            None values can be specified for scalar Tensors or ones that don't require
            grad. If a None value would be acceptable for all grad_tensors, then this
            argument is optional.
        retain_graph (bool, optional): If ``False``, the graph used to compute the grad
            will be freed. Note that in nearly all cases setting this option to ``True``
            is not needed and often can be worked around in a much more efficient
            way. Defaults to the value of ``create_graph``.
        create_graph (bool, optional): If ``True``, graph of the derivative will
            be constructed, allowing to compute higher order derivative products.
            Defaults to ``False``.
w = torch.tensor([1.],requires_grad=True)
x = torch.tensor([2.],requires_grad=True)

a = torch.add(w,x) # a = w + x
b = torch.add(w,1) # b = w + 1

y = torch.mul(a,b) # y = (w+x)*(w+1)

# retain_graph=True keeps the computational graph so backward can be called again;
# without it the graph is freed after the first backward pass and cannot be reused.
# dy/dw = (w+x) + (w+1) = 3 + 2 = 5, and every extra backward pass adds another 5 because gradients accumulate.
y.backward(retain_graph=True)
print(w.grad)
y.backward(retain_graph=True)
print(w.grad)
y.backward()
print(w.grad)
tensor([5.])
tensor([10.])
tensor([15.])
flag = True
if flag:
    w = torch.tensor([1.],requires_grad=True)
    x = torch.tensor([2.],requires_grad=True)
    
    a = torch.add(w,x) # w+x
    b = torch.add(w,1) # w+1
    
    y0 = torch.mul(a,b) # (w+x)*(w+1)=>w^2 + x*w + w +x=>dy0/dw=2w+x+1=5
    y1 = torch.add(a,b) # (w+x)+(w+1)=>2w + x + 1=>dy1/dw=2
    
    loss = torch.cat([y0,y1],dim=0) # [y0,y1]
    grad_tensors = torch.tensor([1.,3.]) # w.grad = dy0/dw*1 + dy1/dw*3 = 5*1 + 2*3 = 11
    
    # the gradient argument is forwarded to grad_tensors in torch.autograd.backward()
    loss.backward(gradient=grad_tensors) # weights for the multiple gradients: each gradient is scaled by its weight before being summed
    
    print(w.grad)
tensor([11.])
flag = True
if flag:
    x = torch.tensor([3.],requires_grad=True)
    y = torch.pow(x,2) # y = x ** 2
    
    # create_graph=True builds a graph of the derivative so higher-order derivatives can be computed;
    # without it the second grad call below would fail. It also keeps grad_fn, recording how each result was produced.
    grad1 = torch.autograd.grad(y,x,
                    create_graph=True) # grad_1 = dy/dx = 2x = 2 * 3 = 6
    print(grad1)
    
    grad2 = torch.autograd.grad(grad1[0],x,
                    create_graph=True) # grad_2 = d(dy/dx)/dx = d(2x)/dx = 2
    print(grad2)
(tensor([6.], grad_fn=<MulBackward0>),)
(tensor([2.], grad_fn=<MulBackward0>),)
flag = True
if flag:

    w = torch.tensor([1.], requires_grad=True)
    x = torch.tensor([2.], requires_grad=True)

    for i in range(4):
        a = torch.add(w, x)
        b = torch.add(w, 1)
        y = torch.mul(a, b)

        y.backward()
        print(w.grad)

        # zero the gradient; without this call the printed values would accumulate as shown below:
        """
            tensor([5.])
            tensor([10.])
            tensor([15.])
            tensor([20.])
        """
        # the trailing underscore marks an in-place operation
        w.grad.zero_()

tensor([5.])
tensor([5.])
tensor([5.])
tensor([5.])
flag = True
if flag:
    # if no leaf node requires grad, the downstream nodes do not require grad either
    w = torch.tensor([1.], requires_grad=False)
    x = torch.tensor([2.], requires_grad=False)

    a = torch.add(w, x)
    b = torch.add(w, 1)
    y = torch.mul(a, b)

    print(a.requires_grad, b.requires_grad, y.requires_grad)
False False False
flag = True
if flag:

    a = torch.ones((1, ))
    print(id(a), a)

    # a = a + ... creates a new tensor, so the id changes
    a = a + torch.ones((1, ))
    print(id(a), a)

    # += is an in-place operation, so the id does not change
    a += torch.ones((1, ))
    print(id(a), a)
2108023962648 tensor([1.])
2108023236000 tensor([2.])
2108023236000 tensor([3.])
flag = True
if flag:

    w = torch.tensor([1.], requires_grad=True)
    x = torch.tensor([2.], requires_grad=True)

    a = torch.add(w, x)
    b = torch.add(w, 1)
    y = torch.mul(a, b)

    # leaf nodes must not be modified in-place; uncommenting the line below raises a RuntimeError
#     w.add_(1)

    y.backward()
    print(w.grad)
tensor([5.])
