
torch notes: the torch.nn module

程序员文章站 2022-06-12 17:08:42

Recurrent layers

class torch.nn.RNN(*args, **kwargs)

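Parameters:
input_size – the number of features in the input x.
hidden_size – the number of features in the hidden state.
num_layers – the number of stacked RNN layers.
bidirectional – if True, the module becomes a bidirectional RNN. Default: False.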


RNN inputs: (input, h_0)

input (seq_len, batch, input_size): tensor holding the features of the input sequence.
h_0 (num_layers * num_directions, batch, hidden_size): tensor holding the initial hidden state.

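RNN outputs: (output, h_n)

output (seq_len, batch, hidden_size * num_directions): tensor holding the output features of the last RNN layer at every time step.
h_n (num_layers * num_directions, batch, hidden_size): tensor holding the hidden state at the final time step, for every layer and direction.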


Example:

import torch
import torch.nn as nn

# input feature size is 10, hidden size is 20, and the RNN has 2 layers
rnn = nn.RNN(10, 20, 2)
# (seq_len, batch, input_size)
input = torch.randn(5, 3, 10)
# (num_layers * num_directions, batch, hidden_size)
h0 = torch.randn(2, 3, 20)
output, hn = rnn(input, h0)
print(output.shape) # (seq_len, batch, hidden_size * num_directions)
print(hn.shape)    # (num_layers * num_directions, batch, hidden_size)

torch.Size([5, 3, 20])
torch.Size([2, 3, 20])
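
To make the num_directions factor in the shapes above concrete, here is a minimal sketch of the same setup with bidirectional=True. The variable names and the seed are illustrative choices, not part of the original example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only so repeated runs match

# Same sizes as the example above, but bidirectional, so num_directions = 2
rnn = nn.RNN(input_size=10, hidden_size=20, num_layers=2, bidirectional=True)
x = torch.randn(5, 3, 10)        # (seq_len, batch, input_size)
h0 = torch.randn(2 * 2, 3, 20)   # (num_layers * num_directions, batch, hidden_size)
output, hn = rnn(x, h0)

print(output.shape)  # torch.Size([5, 3, 40])
print(hn.shape)      # torch.Size([4, 3, 20])

# The last layer's final states also appear inside output:
# the forward direction ends at the last time step, the backward one at the first.
forward_last = output[-1, :, :20]
backward_last = output[0, :, 20:]
print(torch.allclose(forward_last, hn[-2]))
print(torch.allclose(backward_last, hn[-1]))
```

The last dimension of output doubles because the forward and backward features are concatenated, while h_n grows along its first dimension instead.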

The same interface applies to:

class torch.nn.GRU(*args, **kwargs)
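
As a quick check that nn.GRU takes the same inputs and returns the same shapes as nn.RNN, the earlier example can be repeated with only the class swapped. This is a sketch reusing the sizes from the RNN example; the variable names are illustrative.

```python
import torch
import torch.nn as nn

# input feature size 10, hidden size 20, 2 layers (same as the RNN example)
gru = nn.GRU(10, 20, 2)
x = torch.randn(5, 3, 10)    # (seq_len, batch, input_size)
h0 = torch.randn(2, 3, 20)   # (num_layers * num_directions, batch, hidden_size)
output, hn = gru(x, h0)

print(output.shape)  # torch.Size([5, 3, 20])
print(hn.shape)      # torch.Size([2, 3, 20])
```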

Another kind of module, which processes a single time step per call:

class torch.nn.RNNCell(input_size, hidden_size, bias=True, nonlinearity='tanh')
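
Because an RNNCell only advances the hidden state by one step, the loop over the sequence has to be written by hand. A minimal sketch, assuming a zero initial hidden state and the same sizes as the RNN example (both are illustrative choices):

```python
import torch
import torch.nn as nn

cell = nn.RNNCell(input_size=10, hidden_size=20)
x = torch.randn(5, 3, 10)   # (seq_len, batch, input_size)
h = torch.zeros(3, 20)      # (batch, hidden_size), assumed initial state

outputs = []
for t in range(x.size(0)):
    h = cell(x[t], h)       # one time step: (batch, input_size) -> (batch, hidden_size)
    outputs.append(h)

outputs = torch.stack(outputs)  # collect the per-step hidden states
print(outputs.shape)  # torch.Size([5, 3, 20])
print(h.shape)        # torch.Size([3, 20])
```

This reproduces what a single-layer, unidirectional nn.RNN does internally, at the cost of a Python-level loop.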

Linear layers

class torch.nn.Linear(in_features, out_features, bias=True)
Applies a linear transformation to the incoming data: y=xA^T+b

Example:

import torch
import torch.nn as nn

# map 3 input features to 2 output features
m = nn.Linear(3, 2)
input = torch.randn(10, 3)
output = m(input)
print(output.size())


torch.Size([10, 2])
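
The formula y=xA^T+b can be checked directly against the module's parameters. In the sketch below, A is m.weight and b is m.bias; the variable names and the tolerance are illustrative.

```python
import torch
import torch.nn as nn

m = nn.Linear(3, 2)
x = torch.randn(10, 3)

# A is stored as (out_features, in_features), which is why it is transposed
print(m.weight.shape)  # torch.Size([2, 3])
print(m.bias.shape)    # torch.Size([2])

# Recompute the layer by hand: y = x A^T + b
manual = x @ m.weight.T + m.bias
matches = torch.allclose(m(x), manual, atol=1e-6)
print(matches)
```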

Dropout layers

class torch.nn.Dropout(p=0.5, inplace=False)

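Parameters:

p - the probability of an element being zeroed. Default: 0.5
inplace - if set to True, the operation is performed in place. Default: False

Shape:

Input: any. The input can have any shape.
Output: same. The output has the same shape as the input.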


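Example:

import torch
import torch.nn as nn

m = nn.Dropout(p=0.5)
# plain tensors work directly; wrapping in autograd.Variable is no longer needed
input = torch.randn(2, 2)
output = m(input)
print(output)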

tensor([[-0.0000, -2.9296],
        [ 0.0924,  0.0000]])
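
Two details are easy to miss in the output above: during training the surviving elements are scaled by 1/(1-p), and in evaluation mode the layer does nothing. A minimal sketch using an all-ones input so the scaling is visible (the input and variable names are illustrative):

```python
import torch
import torch.nn as nn

m = nn.Dropout(p=0.5)
x = torch.ones(2, 4)

m.train()                 # training mode: dropout is active
y_train = m(x)
# every element is either 0 (dropped) or 1 / (1 - 0.5) = 2 (kept and rescaled)
print(y_train)

m.eval()                  # evaluation mode: dropout is the identity
y_eval = m(x)
print(torch.equal(y_eval, x))  # True
```

The rescaling keeps the expected value of each element unchanged, so no correction is needed at inference time.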

Sparse layers

class torch.nn.Embedding(num_embeddings, embedding_dim, padding_idx=None, max_norm=None, norm_type=2, scale_grad_by_freq=False, sparse=False, _weight=None)

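Parameters:

num_embeddings (int) - the size of the embedding dictionary
embedding_dim (int) - the size of each embedding vector
padding_idx (int, optional) - if given, the entry at this index is a zero vector, so the output is zero-padded wherever this index appears
max_norm (float, optional) - if given, each embedding vector whose norm exceeds this value is renormalized to have norm max_norm
norm_type (float, optional) - the p of the p-norm computed for the max_norm option
scale_grad_by_freq (boolean, optional) - if True, gradients are scaled by the inverse frequency of the words in the mini-batch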

Variables:

weight (Tensor) - the learnable weights of the module, of shape (num_embeddings, embedding_dim)

Shape:

Input: LongTensor (N, W), N = mini-batch size, W = number of indices to look up per sample
Output: (N, W, embedding_dim)


Example:

import torch
import torch.nn as nn

# an Embedding module containing 10 tensors of size 3
embedding = nn.Embedding(10, 3)
# a batch of 2 samples of 4 indices each (Variable is no longer needed)
input = torch.LongTensor([[1,2,4,5],[5,4,2,1]])
embedding(input)

tensor([[[-0.4031,  1.8008,  1.4954],
         [ 0.3768, -0.2439,  0.9262],
         [ 0.8444, -0.1265,  2.0801],
         [ 1.0576, -0.9705, -0.1841]],

        [[ 1.0576, -0.9705, -0.1841],
         [ 0.8444, -0.1265,  2.0801],
         [ 0.3768, -0.2439,  0.9262],
         [-0.4031,  1.8008,  1.4954]]])
embedding.weight


Parameter containing:
tensor([[-0.6084,  0.0402, -1.5447],
        [-0.4031,  1.8008,  1.4954],
        [ 0.3768, -0.2439,  0.9262],
        [ 0.4351, -1.6146,  0.7603],
        [ 0.8444, -0.1265,  2.0801],
        [ 1.0576, -0.9705, -0.1841],
        [ 0.6502, -0.1189,  0.0794],
        [-0.9843, -0.1582, -0.0912],
        [ 0.1690, -0.0980, -0.1338],
        [-0.9448, -1.9642, -0.1723]])
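
Comparing the two printouts above shows that the output rows are copies of rows of embedding.weight. The sketch below checks this programmatically; the variable names are illustrative.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3)
idx = torch.tensor([[1, 2, 4, 5], [5, 4, 2, 1]])  # (N, W) = (2, 4)
out = embedding(idx)

print(out.shape)  # torch.Size([2, 4, 3])

# The forward pass is a row lookup into the weight matrix
print(torch.equal(out, embedding.weight[idx]))  # True

# Index 1 appears at positions [0, 0] and [1, 3], so those rows are identical
print(torch.equal(out[0, 0], out[1, 3]))  # True
```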

Example with padding_idx:

import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3, padding_idx=1)
input = torch.LongTensor([[0,1,0,5]])
embedding(input)

tensor([[[-1.1790,  1.2073, -1.0174],
         [ 0.0000,  0.0000,  0.0000],
         [-1.1790,  1.2073, -1.0174],
         [-0.2278,  1.1332, -0.2259]]])
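
Besides producing a zero row in the output, padding_idx also keeps that row out of training: it receives no gradient. A minimal sketch that backpropagates a simple sum (the loss and variable names are illustrative):

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3, padding_idx=1)
idx = torch.tensor([[0, 1, 0, 5]])

out = embedding(idx)
out.sum().backward()   # illustrative loss: sum of all output elements

pad_row = embedding.weight[1]        # the vector stored at padding_idx
pad_grad = embedding.weight.grad[1]  # its gradient
grad_0 = embedding.weight.grad[0]    # gradient for index 0, which appears twice

print(pad_row)   # all zeros
print(pad_grad)  # tensor([0., 0., 0.])
print(grad_0)    # tensor([2., 2., 2.])
```

Index 0 occurs twice in the input, so its gradient accumulates to 2 per component, while the padding row stays at zero.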