Computer Science (计算机科学), 2020, Vol. 47, Issue (4): 204-210. doi: 10.11896/jsjkx.190100097
YU Shan-shan1, SU Jin-dian2, LI Peng-fei2
Abstract: Attention mechanisms have been widely applied to many natural language processing tasks in recent years, but sentence-level sentiment classification still lacks corresponding research. Exploiting the strength of self-attention in learning important local features of a sentence, and combining it with the Long Short-Term Memory (LSTM) network, this paper proposes an attention-based neural network model (Attentional LSTM, AttLSTM) and applies it to sentence-level sentiment classification. AttLSTM first uses an LSTM to learn the contextual information of the words in a sentence; it then uses a self-attention function to learn the positional information of the words and constructs the corresponding position-weight matrix; next, the final semantic representation of the sentence is obtained by weighted averaging; finally, a multilayer perceptron performs classification and produces the output. Experimental results show that AttLSTM achieves the highest accuracy on the public binary sentiment classification corpora Movie Reviews (MR), Stanford Sentiment Treebank (SSTb2) and Internet Movie Database (IMDB), reaching 82.8%, 88.3% and 91.3% respectively, and obtains 50.6% accuracy on the fine-grained sentiment classification corpus SSTb5.
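The core of the pipeline described above — scoring each LSTM hidden state with a self-attention function, normalizing the scores into a position-weight vector, and taking a weighted average as the sentence representation — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the single learned scoring vector `w`, and the toy dimensions are all assumptions, and in the actual model `H` would be produced by an LSTM rather than sampled at random.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax: shift by the max before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(H, w):
    """Self-attention pooling over LSTM hidden states.

    H: (n, d) matrix of hidden states, one row per time step.
    w: (d,) learned scoring vector (a stand-in for the attention function).
    Returns the (d,) sentence vector and the (n,) attention weights.
    """
    scores = H @ w            # one relevance score per time step
    alpha = softmax(scores)   # position weights, non-negative, sum to 1
    return alpha @ H, alpha   # weighted average of hidden states

# Toy example: 4 time steps, hidden size 3.
rng = np.random.default_rng(0)
H = rng.standard_normal((4, 3))
w = rng.standard_normal(3)
sent_vec, alpha = attention_pool(H, w)
```

In the full model the sentence vector `sent_vec` would then be fed to a multilayer perceptron with a softmax output layer for classification; the weighted average lets positions the attention function deems important dominate the final representation.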