torch.embedding and EmbeddingBag explained
程序员文章站
2022-06-13 15:51:47
Embedding
torch.embedding is essentially a lookup table: it is normally used to store word embeddings, and the embedding vectors are retrieved from it by index.
Location:
torch.nn.Embedding
The parameters and their official descriptions are:
- num_embeddings (int): size of the dictionary of embeddings
- embedding_dim (int): the size of each embedding vector
- padding_idx (int, optional): If given, pads the output with the embedding vector at padding_idx (initialized to zeros) whenever it encounters the index.
- max_norm (float, optional): If given, each embedding vector with norm larger than max_norm is renormalized to have norm max_norm.
- norm_type (float, optional): The p of the p-norm to compute for the max_norm option. Default 2.
- scale_grad_by_freq (bool, optional): If given, this will scale gradients by the inverse of frequency of the words in the mini-batch. Default False.
- sparse (bool, optional): If True, gradient w.r.t. weight matrix will be a sparse tensor. See Notes for more details regarding sparse gradients.
Attributes:
- weight (Tensor): the learnable weights of the module of shape (num_embeddings, embedding_dim), initialized from the normal distribution N(0, 1)
Shape:
- Input: (*), LongTensor of arbitrary shape containing the indices to extract
- Output: (*, H), where * is the input shape and H = embedding_dim
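The shape rule and the max_norm parameter described above can be checked with a small experiment. This is only a minimal sketch, assuming a reasonably recent PyTorch; the sizes and variable names are illustrative and not taken from the official documentation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the random initial weights repeatable

# 10 embeddings of size 4; rows with an L2 norm above 1.0 are rescaled on lookup
embedding = nn.Embedding(10, 4, max_norm=1.0)

indices = torch.tensor([[0, 1, 2], [3, 4, 5]])  # input shape (2, 3)
out = embedding(indices)                         # output shape (2, 3, 4)

# L2 norm of every returned vector
norms = out.norm(dim=-1)
print(out.shape, norms.max().item())
```

The output shape is the input shape with embedding_dim appended, and no returned vector has a norm above max_norm (up to floating-point tolerance).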
Examples::
>>> # an Embedding module containing 10 tensors of size 3
>>> embedding = nn.Embedding(10, 3)
>>> # a batch of 2 samples of 4 indices each
>>> input = torch.LongTensor([[1,2,4,5],[4,3,2,9]])
>>> embedding(input)
tensor([[[-0.0251, -1.6902,  0.7172],
         [-0.6431,  0.0748,  0.6969],
         [ 1.4970,  1.3448, -0.9685],
         [-0.3677, -2.7265, -0.1685]],

        [[ 1.4970,  1.3448, -0.9685],
         [ 0.4362, -0.4004,  0.9400],
         [-0.6431,  0.0748,  0.6969],
         [ 0.9124, -2.3616,  1.1151]]])
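Note that identical indices produce identical embeddings. For example, index 2 in the first sample and index 2 in the second sample both map to [-0.6431, 0.0748, 0.6969].
In other words, for a given embedding table, the same input index always returns the same tensor.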
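The lookup-table behaviour can be verified directly. The following is a minimal sketch (variable names are illustrative) showing that calling the module gives the same result as indexing its weight matrix with the same indices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the random initial weights repeatable

embedding = nn.Embedding(10, 3)
indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])  # int64 by default

out = embedding(indices)            # shape (2, 4, 3)
manual = embedding.weight[indices]  # plain row lookup in the weight matrix

same_as_indexing = torch.equal(out, manual)
# Index 2 sits at position [0, 1] and at position [1, 2]; both hold the same row.
same_row = torch.equal(out[0, 1], out[1, 2])
print(out.shape, same_as_indexing, same_row)
```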
with padding
>>> # example with padding_idx
>>> embedding = nn.Embedding(10, 3, padding_idx=0)
>>> input = torch.LongTensor([[0,2,0,5]])
>>> embedding(input)
tensor([[[ 0.0000,  0.0000,  0.0000],
         [ 0.1535, -2.0309,  0.9315],
         [ 0.0000,  0.0000,  0.0000],
         [-0.1655,  0.9897,  0.0635]]])
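When padding is used, for example with padding_idx = 0, the first and third indices are 0, so their output vectors are all zeros. Indices 2 and 5 are not the padding index, so their outputs are ordinary embedding vectors rather than zeros.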
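Beyond returning zeros, the row at padding_idx also receives no gradient, so it is not updated during training. The sketch below illustrates both points under the assumption of a dummy loss (the sum of the outputs); the variable names are illustrative.

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 3, padding_idx=0)
indices = torch.tensor([[0, 2, 0, 5]])

out = embedding(indices)
# Positions 0 and 2 of the sample use the padding index, so they are zero vectors.
padding_rows_are_zero = bool((out[0, 0] == 0).all() and (out[0, 2] == 0).all())

# Backpropagate a dummy loss; the padding row should get no gradient.
out.sum().backward()
padding_grad = embedding.weight.grad[0]  # gradient for row 0 (the padding row)
other_grad = embedding.weight.grad[2]    # gradient for row 2 (a regular row)
print(padding_rows_are_zero, padding_grad, other_grad)
```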
One of its classmethods:
@classmethod
def from_pretrained(cls, embeddings, freeze=True, padding_idx=None,
                    max_norm=None, norm_type=2., scale_grad_by_freq=False, sparse=False):
    r"""Creates Embedding instance from given 2-dimensional FloatTensor.

    Args:
        embeddings (Tensor): FloatTensor containing weights for the Embedding.
            First dimension is being passed to Embedding as ``num_embeddings``, second as ``embedding_dim``.
        freeze (boolean, optional): If ``True``, the tensor does not get updated in the learning process.
            Equivalent to ``embedding.weight.requires_grad = False``. Default: ``True``
        padding_idx (int, optional): See module initialization documentation.
        max_norm (float, optional): See module initialization documentation.
        norm_type (float, optional): See module initialization documentation. Default ``2``.
        scale_grad_by_freq (boolean, optional): See module initialization documentation. Default ``False``.
        sparse (bool, optional): See module initialization documentation.

    Examples::

        >>> # FloatTensor containing pretrained weights
        >>> weight = torch.FloatTensor([[1, 2.3, 3], [4, 5.1, 6.3]])
        >>> embedding = nn.Embedding.from_pretrained(weight)
        >>> # Get embeddings for index 1
        >>> input = torch.LongTensor([1])
        >>> embedding(input)
        tensor([[ 4.0000,  5.1000,  6.3000]])
    """
EmbeddingBag
Computes sums or means of ‘bags’ of embeddings, without instantiating the intermediate embeddings.
Three modes are supported:
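- sum: equivalent to torch.nn.Embedding followed by torch.sum(dim=1)
- mean: equivalent to torch.nn.Embedding followed by torch.mean(dim=1)
- max: equivalent to torch.nn.Embedding followed by torch.max(dim=1)
(These equivalences hold for bags of constant length passed as a 2D input of shape (B, N), where dim=1 is the dimension that runs over the items in each bag.)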
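However, EmbeddingBag is more time- and memory-efficient than chaining these operations, because it never materializes the intermediate embeddings.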
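The equivalence between the modes and the Embedding-plus-reduction form can be checked numerically. This is a minimal sketch for bags of constant length given as a 2D input; the shared random weight matrix and the `reference` dictionary are illustrative helpers, not part of the PyTorch API.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
weight = torch.randn(10, 3)  # shared table so both modules look up the same rows

embedding = nn.Embedding.from_pretrained(weight)

# 2 bags, each holding 4 indices
indices = torch.tensor([[1, 2, 4, 5], [4, 3, 2, 9]])

# Reference reductions over the bag dimension (dim=1) of the (2, 4, 3) lookup result
reference = {
    "sum": lambda e: e.sum(dim=1),
    "mean": lambda e: e.mean(dim=1),
    "max": lambda e: e.max(dim=1).values,
}

results = {}
for mode, reduce_fn in reference.items():
    bag = nn.EmbeddingBag.from_pretrained(weight, mode=mode)
    pooled = bag(indices)  # shape (2, 3), no (2, 4, 3) intermediate is built
    results[mode] = torch.allclose(pooled, reduce_fn(embedding(indices)))

print(results)
```

For bags of different lengths, EmbeddingBag also accepts a flat 1D index tensor together with an offsets tensor marking where each bag starts.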
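PyTorch also lets you pass per-sample weights in the forward pass (the per_sample_weights argument), but only when mode == sum. If this argument is None, every weight is treated as 1 when the sum is computed; otherwise the weighted sum is computed with the weights supplied.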
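A small worked example of per-sample weights follows. It is a sketch with a hand-picked weight matrix so the arithmetic is easy to follow; the values and variable names are illustrative only.

```python
import torch
import torch.nn as nn

# Rows: [0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]
weight = torch.arange(12, dtype=torch.float32).reshape(4, 3)
bag = nn.EmbeddingBag.from_pretrained(weight, mode="sum")

indices = torch.tensor([[0, 1], [2, 3]])                       # 2 bags of 2 indices
per_sample_weights = torch.tensor([[1.0, 0.5], [2.0, 0.0]])    # same shape as indices

weighted = bag(indices, per_sample_weights=per_sample_weights)
unweighted = bag(indices)  # per_sample_weights=None: every weight counts as 1

# Manual weighted sum for comparison
expected = (weight[indices] * per_sample_weights.unsqueeze(-1)).sum(dim=1)
print(weighted)
print(unweighted)
```

The first bag is 1.0 * [0, 1, 2] + 0.5 * [3, 4, 5] = [1.5, 3.0, 4.5], and the second is 2.0 * [6, 7, 8] + 0.0 * [9, 10, 11] = [12, 14, 16].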
The other parameters are largely the same as those of Embedding.