PyTorch Official Tutorial Study Notes (2)
Autograd: Automatic Differentiation
Central to all neural networks in PyTorch is the autograd
package.
Let’s first briefly visit this, and we will then go to training our
first neural network.
The autograd package provides automatic differentiation for all operations on Tensors. It is a define-by-run framework, which means that your backprop is defined by how your code is run, and that every single iteration can be different.
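To make "define-by-run" concrete, here is a minimal sketch. The helper `repeated_double` is a hypothetical name introduced only for illustration; it runs a loop whose length is chosen at call time, so each call records a different graph and yields a different gradient.

```python
import torch

def repeated_double(x, n_steps):
    # Hypothetical helper: the number of multiplications is decided at run time,
    # so every call records a graph of a different depth.
    y = x
    for _ in range(n_steps):
        y = y * 2
    return y.sum()

x = torch.ones(3, requires_grad=True)

repeated_double(x, 2).backward()
grad_two_steps = x.grad.clone()    # gradient of sum(4 * x)

x.grad.zero_()                     # gradients accumulate, so reset before the next pass
repeated_double(x, 3).backward()
grad_three_steps = x.grad.clone()  # gradient of sum(8 * x)

print(grad_two_steps)
print(grad_three_steps)
```

The same Python function produces a different backward pass on each call, which is what "every single iteration can be different" refers to.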
Let us see this in more simple terms with some examples.
Tensor
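Setting the .requires_grad attribute of a torch.Tensor to True causes all operations on that tensor to be tracked. Once the computation is finished, calling .backward() computes the gradients, and the gradient for this tensor is accumulated into its .grad attribute.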
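The following sketch, with tensor values chosen arbitrarily for illustration, shows that .grad is filled in by .backward() and that a second backward pass adds to the stored gradient rather than replacing it.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

loss = (x ** 2).sum()
loss.backward()
first_grad = x.grad.clone()        # 2 * x

# A second backward pass adds to .grad instead of overwriting it.
loss = (x ** 2).sum()
loss.backward()
accumulated_grad = x.grad.clone()  # 2 * x + 2 * x

print(first_grad)
print(accumulated_grad)
```

This accumulation is why training loops usually reset gradients between iterations.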
Calling .detach() returns a tensor that is detached from the computation history; subsequent computations on it are not tracked.
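A small sketch of .detach(), using made-up values: the detached tensor holds the same numbers as the original but carries no history, and anything computed from it is untracked as well.

```python
import torch

x = torch.ones(2, requires_grad=True)
y = x * 3                 # tracked: y has a grad_fn
y_detached = y.detach()   # same values, but cut off from the history

print(y.requires_grad, y.grad_fn is not None)
print(y_detached.requires_grad, y_detached.grad_fn)

z = y_detached * 2        # computed from the detached tensor, so not tracked
print(z.requires_grad)
```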
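Likewise, wrapping a code block in the context manager with torch.no_grad(): prevents tracking history (and using the associated memory). This is particularly useful when evaluating a model: the model may have trainable parameters with requires_grad=True, but gradients are not needed during evaluation.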
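As a rough illustration of the evaluation case, the sketch below uses a single torch.nn.Linear layer as a hypothetical stand-in for a trained model. The parameters keep requires_grad=True, yet the output computed inside the block carries no history.

```python
import torch

model = torch.nn.Linear(4, 2)   # hypothetical stand-in for a trained model
inputs = torch.randn(3, 4)      # illustrative batch of 3 samples

model.eval()
with torch.no_grad():
    predictions = model(inputs)  # no graph is recorded here

tracked = model(inputs)          # outside the block, tracking resumes

print(model.weight.requires_grad)
print(predictions.requires_grad)
print(tracked.requires_grad)
```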
There’s one more class which is very important for the autograd implementation: a Function.
Every tensor has a .grad_fn attribute that references the Function that created it (for tensors created directly by the user, this attribute is None).
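The difference between a user-created tensor and the result of an operation can be inspected directly. This sketch also prints .is_leaf, which is not mentioned in these notes but reports the same distinction for this case.

```python
import torch

leaf = torch.ones(2, 2, requires_grad=True)  # created directly by the user
result = leaf + 2                            # created by an operation

print(leaf.grad_fn)    # no creator for a user-created tensor
print(leaf.is_leaf)
print(result.grad_fn)  # the backward node of the addition
print(result.is_leaf)
```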
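Calling .backward() computes the gradients of a Tensor. If the tensor is a scalar, .backward() needs no arguments; otherwise, a gradient argument must be given, which is a tensor of matching shape.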
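A minimal sketch of the non-scalar case, with illustrative values: calling .backward() without arguments on a vector raises a RuntimeError, while passing a tensor of the same shape works. The flag name `needs_gradient_argument` is introduced here only for the demonstration.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2   # y is a vector, not a scalar

try:
    y.backward()                   # no gradient argument for a non-scalar output
    needs_gradient_argument = False
except RuntimeError:
    needs_gradient_argument = True

y = x ** 2                         # rebuild the graph for a clean second attempt
y.backward(torch.ones_like(y))     # gradient argument with the same shape as y

print(needs_gradient_argument)
print(x.grad)
```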
import torch
Create a tensor and set requires_grad=True to track computation with it
x = torch.ones(2, 2, requires_grad=True)
print(x)
tensor([[1., 1.],
[1., 1.]], requires_grad=True)
Do an operation on the tensor:
y = x + 2
print(y)
tensor([[3., 3.],
[3., 3.]], grad_fn=<AddBackward>)
y was created as a result of an operation, so it has a grad_fn.
print(y.grad_fn)
<AddBackward object at 0x000001C98CA1A748>
Do more operations on y
z = y * y * 3
out = z.mean()
print(z, out)
tensor([[27., 27.],
[27., 27.]], grad_fn=<MulBackward>) tensor(27., grad_fn=<MeanBackward1>)
.requires_grad_( ... ) changes an existing Tensor’s requires_grad flag in-place. A tensor's requires_grad flag defaults to False if it is not given at creation; the argument of .requires_grad_() itself defaults to True.
a = torch.randn(2, 2)
a = ((a * 3) / (a - 1))
print(a.requires_grad)
a.requires_grad_(True)
print(a.requires_grad)
b = (a * a).sum()
print(b.grad_fn)
print(b)
False
True
<SumBackward0 object at 0x000001C98CA1AE10>
tensor(4.0653, grad_fn=<SumBackward0>)
Gradients
Let’s backprop now
Because out contains a single scalar, out.backward() is equivalent to out.backward(torch.tensor(1.)).
out.backward()  # compute the gradients; out is a scalar, so no argument is needed
print gradients d(out)/dx
print("Gradient of out with respect to x:\n", x.grad)
Gradient of out with respect to x:
tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])
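You should have got a matrix of 4.5. Let’s call the out Tensor “$o$”. We have that $o = \frac{1}{4}\sum_i z_i$, $z_i = 3(x_i+2)^2$ and $z_i\bigr\rvert_{x_i=1} = 27$. Therefore, $\frac{\partial o}{\partial x_i} = \frac{3}{2}(x_i+2)$, hence $\frac{\partial o}{\partial x_i}\bigr\rvert_{x_i=1} = \frac{9}{2} = 4.5$.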
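The value 4.5 can be cross-checked numerically. Since out is the mean over four elements of 3 * (x + 2) ** 2, the closed form of the gradient is 1.5 * (x + 2). The sketch below compares that closed form with what autograd returns, using an input other than all ones (values chosen for illustration) so the check is not limited to one point.

```python
import torch

x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)
out = (3 * (x + 2) ** 2).mean()
out.backward()

closed_form = 1.5 * (x.detach() + 2)   # 3/2 * (x_i + 2)
matches = torch.allclose(x.grad, closed_form)

print(x.grad)
print(matches)
```

For the entry where x is 1 the gradient is 4.5, in agreement with the matrix printed above.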
You can do many crazy things with autograd!
x = torch.randn(3, requires_grad=True)
y = x * 2
while y.data.norm() < 1000:
    y = y * 2
print(y)
gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)  # y is not a scalar, so the gradient argument must be specified
y.backward(gradients)
print(x.grad)
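Because x is random above, the printed gradient changes from run to run. The following deterministic variant is a sketch with a fixed, illustrative x and a hypothetical counter `doublings`. It shows what the result means: y ends up as x times a power of two, so x.grad is the gradient argument scaled by that same power of two.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2
doublings = 1                       # counts how many times x has been doubled
while y.detach().norm() < 1000:
    y = y * 2
    doublings += 1

v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)

expected = v * 2 ** doublings       # each entry of v scaled by dy_i/dx_i
print(doublings)
print(x.grad)
print(torch.allclose(x.grad, expected))
```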
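# An example where the gradient argument of .backward() must be specified
x1 = torch.ones([3, 3], requires_grad=True)  # requires_grad must be True here; otherwise operations on x1 are not tracked and no gradient can be computed for it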
print(x1)
tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]], requires_grad=True)
y1 = x1 ** 2
print(y1)
print(y1.grad_fn)
tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]], grad_fn=<PowBackward0>)
<PowBackward0 object at 0x000001C98CA27978>
gradients1 = torch.ones([3, 3]) * 3
print(gradients1)
y1.backward(gradients1)
print(x1.grad)
tensor([[3., 3., 3.],
[3., 3., 3.],
[3., 3., 3.]])
tensor([[6., 6., 6.],
[6., 6., 6.],
[6., 6., 6.]])
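The derivative of y1 with respect to x1 is 2 * x1, which equals 2 at x1 = 1. Each entry is then multiplied by the matching entry of the gradient argument (3 here), so x1.grad is 2 * 3 = 6.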
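One way to read the gradient argument is as a set of weights: passing it to .backward() gives the same result as multiplying the output by those weights, summing to a scalar, and calling .backward() with no argument. The sketch below checks this on the same 3x3 setup; the names `weights`, `x_a` and `x_b` are introduced only for the comparison.

```python
import torch

weights = torch.ones(3, 3) * 3

# Variant A: pass the weights as the gradient argument.
x_a = torch.ones(3, 3, requires_grad=True)
(x_a ** 2).backward(weights)

# Variant B: reduce to a weighted scalar first, then call backward() with no argument.
x_b = torch.ones(3, 3, requires_grad=True)
((x_b ** 2) * weights).sum().backward()

print(x_a.grad)
print(torch.equal(x_a.grad, x_b.grad))
```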
You can also stop autograd from tracking history on Tensors with .requires_grad=True by wrapping the code block in with torch.no_grad():
print(x.requires_grad)
print((x ** 2).requires_grad)
with torch.no_grad():
    print((x ** 2).requires_grad)