[PyTorch] Implementing spatial dropout

程序员文章站 2022-07-13 10:11:15

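Dropout is a widely used regularization technique for neural networks: by randomly deactivating units during training, it reduces the co-dependence between units and thereby lowers the risk of overfitting. Experiments show that applying ordinary dropout directly after Embedding and CNN layers brings little benefit. A likely reason is that fully random, unstructured dropout damages the spatial correlation between neighboring activations and so weakens the layer's ability to capture features. Researchers therefore proposed dropping entire slices along certain axes, a strategy known as spatial dropout.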

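Take dropout after an Embedding layer as an example, where the tensor has shape (batch, timesteps, embedding). Ordinary dropout picks elements at random across all dimensions. With spatial dropout, the same drop decision is instead shared along either the timesteps axis or the embedding axis: the former drops entire embedding channels, while the latter drops entire tokens.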
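The difference between the three cases comes down to the shape of the random mask and how it broadcasts against the input. The following is a minimal sketch rather than the article's implementation; the helper name `make_mask` and the tensor sizes are illustrative assumptions.

```python
import torch

torch.manual_seed(0)
batch, timesteps, embedding = 2, 4, 6
drop = 0.5
x = torch.ones(batch, timesteps, embedding)


def make_mask(shape, drop):
    """Hypothetical helper: Bernoulli keep-mask, rescaled by 1 / (1 - drop)."""
    return torch.bernoulli(torch.full(shape, 1 - drop)) / (1 - drop)


# Ordinary dropout: an independent decision for every element.
elementwise = x * make_mask((batch, timesteps, embedding), drop)

# Mask shared along the timesteps axis: whole embedding channels are dropped.
channel_drop = x * make_mask((batch, 1, embedding), drop)

# Mask shared along the embedding axis: whole tokens are dropped.
token_drop = x * make_mask((batch, timesteps, 1), drop)

print(channel_drop[0])  # each column is either all zeros or all 2.0
print(token_drop[0])    # each row is either all zeros or all 2.0
```

A size of 1 in the mask shape marks an axis along which one decision is broadcast to every position, which is the convention the `noise_shape` argument relies on.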

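PyTorch does not provide a spatial dropout interface with a configurable mask shape (its built-in channel-wise layers such as nn.Dropout2d assume a channel-first layout), so this article implements one, modeled on the Dropout layer in Keras and its noise_shape argument: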

import torch.nn as nn


class SpatialDropout(nn.Module):
    """
    Spatial dropout: drops whole slices along chosen axes instead of single
    elements. Commonly used after Embedding and CNN layers.
    For a (batch, timesteps, embedding) input, sharing the mask along axis=1
    drops entire embedding channels; sharing it along axis=2 drops entire tokens.
    """
    def __init__(self, drop=0.5):
        super().__init__()
        self.drop = drop

    def forward(self, inputs, noise_shape=None):
        """
        @param: inputs, tensor
        @param: noise_shape, tuple with the same number of dims as inputs;
                a size of 1 marks an axis along which the mask is shared
        """
        # Dropout is the identity at evaluation time or when nothing is dropped.
        if not self.training or self.drop == 0:
            return inputs
        if noise_shape is None:
            # Default: share the mask across all middle axes.
            noise_shape = (inputs.shape[0], *([1] * (inputs.dim() - 2)), inputs.shape[-1])
        self.noise_shape = noise_shape

        noises = self._make_noises(inputs)
        if self.drop == 1:
            noises.fill_(0.0)
        else:
            # Keep with probability 1 - drop, then rescale the kept values.
            noises.bernoulli_(1 - self.drop).div_(1 - self.drop)
        # Out-of-place multiply, so the caller's tensor is never modified.
        return inputs * noises.expand_as(inputs)

    def _make_noises(self, inputs):
        # Uninitialized tensor with the same dtype and device as inputs.
        return inputs.new_empty(self.noise_shape)
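A short usage sketch may help show how `noise_shape` selects what gets dropped. The class is restated compactly here only so the snippet runs on its own; the tensor sizes and the drop rate are illustrative assumptions.

```python
import torch
import torch.nn as nn


class SpatialDropout(nn.Module):
    """Compact restatement of the class above, so this snippet is self-contained."""

    def __init__(self, drop=0.5):
        super().__init__()
        self.drop = drop

    def forward(self, inputs, noise_shape=None):
        if not self.training or self.drop == 0:
            return inputs
        if noise_shape is None:
            # Default: share the mask across all middle axes (channel-last layout).
            noise_shape = (inputs.shape[0], *([1] * (inputs.dim() - 2)), inputs.shape[-1])
        noises = inputs.new_empty(noise_shape)
        if self.drop == 1:
            noises.fill_(0.0)
        else:
            noises.bernoulli_(1 - self.drop).div_(1 - self.drop)
        return inputs * noises  # broadcasting expands the mask


torch.manual_seed(0)
layer = SpatialDropout(drop=0.3)
layer.train()

# Embedding output: (batch, timesteps, embedding).
emb = torch.randn(8, 20, 16)
out_channels = layer(emb)                          # default: drop whole embedding channels
out_tokens = layer(emb, noise_shape=(8, 20, 1))    # drop whole tokens

# Channel-first CNN output: (batch, channels, height, width).
feat = torch.randn(8, 32, 5, 5)
out_maps = layer(feat, noise_shape=(8, 32, 1, 1))  # drop whole feature maps

layer.eval()
assert layer(emb) is emb                           # identity at evaluation time
```

Note that the default `noise_shape` treats the last axis as the channel axis, as in Keras. For PyTorch's channel-first CNN tensors the shape has to be passed explicitly, as in the `out_maps` line, otherwise the mask would be shared over channels and height rather than over height and width.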
