Aspect-level Sentiment Classification with HEAT (HiErarchical ATtention)
This post records my understanding of the paper **Aspect-level Sentiment Classification with HEAT (HiErarchical ATtention)**, focusing on its model.
The model uses a two-level attention network to classify sentiment with respect to a given aspect word: the first attention level learns aspect information from the sentence, and the second uses both the aspect and the extracted aspect information to attend to the aspect-specific sentiment information. Take, for example, a sentence like “The food tastes great”:
Given the aspect word food, the two-level attention model first attends, based on “food”, to the word “tastes” (the aspect term); it then uses the aspect word “food” together with “tastes” to locate the word “great”. Grounding the attention on aspect terms in this way makes it easier to determine the sentiment polarity toward the given aspect.
1 Model
1.1 HEAT Network Structure
The model consists of three modules:
Input Model: encodes the sentence and the aspect word into vectors.
Hierarchical Attention Model: uses two attention layers to capture aspect information (the aspect attention layer) and aspect-specific sentiment information (the sentiment attention layer).
Sentiment Classification Model: performs the sentiment classification.
1.2 Input Model
The sentence is encoded with a bidirectional GRU (BiGRU): each word receives a hidden state, and the sequence of hidden states serves as the feature representation of the sentence.
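As a sketch, the standard BiGRU formulation (the notation is mine, not necessarily the paper's exact symbols) encodes each word embedding $x_t$ with a forward and a backward GRU and concatenates the two hidden states:

$$
\overrightarrow{h}_t = \overrightarrow{\mathrm{GRU}}\big(x_t, \overrightarrow{h}_{t-1}\big), \qquad
\overleftarrow{h}_t = \overleftarrow{\mathrm{GRU}}\big(x_t, \overleftarrow{h}_{t+1}\big), \qquad
h_t = \big[\overrightarrow{h}_t \, ; \, \overleftarrow{h}_t\big].
$$

The sequence $H = [h_1, h_2, \dots, h_L]$ of these hidden states is the sentence representation used by both attention layers.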
1.3 Hierarchical Attention Model
Aspect Attention
Aspect attention locates the likely aspect terms; its input is the output of the BiGRU.
The attention mechanism computes a weight for each word from the given aspect representation and the sentence's feature representation, and the aspect information of the sentence is then the attention-weighted sum of those features:
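A sketch of this layer, consistent with the code in Section 2 (the symbols are mine: $v_a$ is the aspect embedding, $h_i^a$ the hidden state of word $i$ from the aspect BiGRU, and $w_a$ a learned vector):

$$
m_i^a = \tanh\!\big(\big[h_i^a \, ; \, v_a\big]\big), \qquad
\alpha_i = \frac{\exp\big(w_a^{\top} m_i^a\big)}{\sum_{j=1}^{L} \exp\big(w_a^{\top} m_j^a\big)}, \qquad
r_a = \sum_{i=1}^{L} \alpha_i \, h_i^a .
$$

Here $r_a$ is the aspect information of the sentence with respect to the given aspect.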
Sentiment Attention
Sentiment attention extracts the sentence's sentiment features based on the aspect word and the aspect information. As with aspect attention, its input is the output of a BiGRU.
Because aspect information and sentiment information require different features, the two GRUs do not share parameters.
An attention score is then computed for each word from the sentence's feature vectors, the aspect embedding, and the sentence's aspect information.
To compute the attention weights more accurately, the paper also exploits the locality of the aspect terms: a sentiment word close to an aspect term matters more than a distant one. A location mask layer, implemented with a location matrix, captures this local information.
Words closer to the aspect term therefore receive larger weights in the sentiment attention scores.
The sentiment feature of the sentence with respect to the given aspect is then the weighted sum of the sentence features:
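A sketch of the sentiment attention, again matching the code in Section 2 (symbols mine; $h_i^p$ is the hidden state of word $i$ from the sentiment BiGRU and $w$ a learned vector):

$$
m_i^p = \tanh\!\big(\big[h_i^p \, ; \, r_a \, ; \, v_a\big]\big), \qquad
g_i = w^{\top} m_i^p, \qquad
\beta_i = \frac{\exp(g_i)}{\sum_{j=1}^{L} \exp(g_j)}, \qquad
r_p = \sum_{i=1}^{L} \beta_i \, h_i^p .
$$

In the paper the scores $g_i$ are additionally adjusted by the location mask described above, so that words nearer the attended aspect term obtain larger weights; the code in Section 2 omits this mask.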
1.4 Sentiment Classification Model
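As in the code in Section 2 (the decoder_p layer), the aspect-specific sentiment feature $r_p$ is concatenated with the aspect embedding $v_a$ and projected to the class scores; a softmax over these logits gives the predicted polarity. With $W_o, b_o$ as my notation for the layer's parameters:

$$
\hat{y} = \mathrm{softmax}\big(W_o \big[r_p \, ; \, v_a\big] + b_o\big).
$$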
2 Core Code
import torch
import torch.nn as nn
import torch.nn.functional as F


class HEAT(nn.Module):
    def __init__(self, word_embed_dim, output_size, vocab_size, aspect_size, args=None):
        super(HEAT, self).__init__()
        self.input_size = word_embed_dim if (args.use_elmo == 0) else (word_embed_dim + 1024 if args.use_elmo == 1 else 1024)
        self.hidden_size = args.n_hidden
        self.output_size = output_size
        self.max_length = 1
        self.lr = 0.0005
        # WordRep is the word-representation module (defined outside this snippet)
        self.word_rep = WordRep(vocab_size, word_embed_dim, None, args)
        self.rnn_a = nn.GRU(self.input_size, self.hidden_size // 2, bidirectional=True)
        self.AE = nn.Embedding(aspect_size, word_embed_dim)
        self.W_h_a = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_v_a = nn.Linear(word_embed_dim, self.input_size)
        self.w_a = nn.Linear(self.hidden_size + word_embed_dim, 1)
        self.W_p_a = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_x_a = nn.Linear(self.hidden_size, self.hidden_size)
        self.rnn_p = nn.GRU(self.input_size, self.hidden_size // 2, bidirectional=True)
        self.W_h = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_v = nn.Linear(word_embed_dim + self.hidden_size, word_embed_dim + self.hidden_size)
        self.w = nn.Linear(2 * self.hidden_size + word_embed_dim, 1)
        self.W_p = nn.Linear(self.hidden_size, self.hidden_size)
        self.W_x = nn.Linear(self.hidden_size, self.hidden_size)
        self.decoder_p = nn.Linear(self.hidden_size + word_embed_dim, output_size)
        self.dropout = nn.Dropout(args.dropout)
        self.optimizer = torch.optim.Adam(self.parameters(), lr=self.lr)

    def forward(self, input_tensors):
        assert len(input_tensors) == 3
        aspect_i = input_tensors[2]
        # feature representation of the sentence from the word-representation layer
        sentence = self.word_rep(input_tensors)
        # sentence length
        length = sentence.size()[0]
        # two GRUs: one for aspect attention, one for sentiment attention
        output_a, hidden = self.rnn_a(sentence)
        output_p, _ = self.rnn_p(sentence)
        # reshape to [length, hidden_size] (batch size 1)
        output_a = output_a.view(output_a.size()[0], -1)
        output_p = output_p.view(length, -1)
        # aspect embedding: [1, word_embed_dim]
        aspect_e = self.AE(aspect_i)
        aspect_embedding = aspect_e.view(1, -1)
        # expand the aspect embedding to every position: [length, word_embed_dim]
        aspect_embedding = aspect_embedding.expand(length, -1)
        # aspect attention: score each word against the aspect, [length, hidden_size + word_embed_dim]
        M_a = torch.tanh(torch.cat((output_a, aspect_embedding), dim=1))
        # attention weights over the words: [1, length]
        weights_a = F.softmax(self.w_a(M_a), dim=0).t()
        # aspect information of the sentence w.r.t. the given aspect: [1, hidden_size]
        r_a = torch.matmul(weights_a, output_a)
        # sentiment attention
        # [length, hidden_size]
        r_a_expand = r_a.expand(length, -1)
        # [length, hidden_size + word_embed_dim]
        query4PA = torch.cat((r_a_expand, aspect_embedding), dim=1)
        # [length, 2 * hidden_size + word_embed_dim]
        M_p = torch.tanh(torch.cat((output_p, query4PA), dim=1))
        # [length, 1]
        g_p = self.w(M_p)
        weights_p = F.softmax(g_p, dim=0).t()
        # sentiment feature: [1, hidden_size]
        r_p = torch.matmul(weights_p, output_p)
        r = torch.cat((r_p, aspect_e), dim=1)
        # output logits
        output = self.decoder_p(r)
        return output
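To make the tensor shapes in the comments above concrete, here is a minimal, self-contained sketch of the two attention steps on random tensors (not the author's code; it assumes hidden_size = 128 and word_embed_dim = 200):

    import torch
    import torch.nn.functional as F

    length, hidden_size, embed_dim = 10, 128, 200
    output_a = torch.randn(length, hidden_size)                       # aspect-BiGRU output
    output_p = torch.randn(length, hidden_size)                       # sentiment-BiGRU output
    aspect_embedding = torch.randn(1, embed_dim).expand(length, -1)   # aspect embedding per position

    # Aspect attention: score every word against the aspect embedding.
    w_a = torch.nn.Linear(hidden_size + embed_dim, 1)
    M_a = torch.tanh(torch.cat((output_a, aspect_embedding), dim=1))  # [length, 328]
    weights_a = F.softmax(w_a(M_a), dim=0).t()                        # [1, length]
    r_a = weights_a @ output_a                                        # [1, hidden_size]

    # Sentiment attention: condition on both the aspect embedding and r_a.
    w = torch.nn.Linear(2 * hidden_size + embed_dim, 1)
    query = torch.cat((r_a.expand(length, -1), aspect_embedding), dim=1)  # [length, 328]
    M_p = torch.tanh(torch.cat((output_p, query), dim=1))                 # [length, 456]
    weights_p = F.softmax(w(M_p), dim=0).t()                              # [1, length]
    r_p = weights_p @ output_p                                            # [1, hidden_size]

    print(r_a.shape, r_p.shape)   # torch.Size([1, 128]) torch.Size([1, 128])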