Computer Science ›› 2020, Vol. 47 ›› Issue (4): 204-210. doi: 10.11896/jsjkx.190100097

• Artificial Intelligence •

Sentiment Classification Method for Sentences via Self-attention

YU Shan-shan1, SU Jin-dian2, LI Peng-fei2   

  1. College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China;
    2. College of Computer Science and Engineering, South China University of Technology, Guangzhou 510640, China
  • Received: 2019-01-13 Online: 2020-04-15 Published: 2020-04-15
  • Contact: SU Jin-dian (SuJD@scut.edu.cn), born in 1980, Ph.D, associate professor. His main research interests include natural language processing, artificial intelligence and machine learning.
  • About author: YU Shan-shan, born in 1980, Ph.D, is a senior member of China Computer Federation. Her main research interests include machine learning, big data and the semantic Web.
  • Supported by:
    This work was supported by the Natural Science Foundation of Guangdong Province (2015A030310318), the Applied Science and Technology R&D Special Fund of the Department of Science and Technology of Guangdong Province (20168010124010) and the Medical Scientific Research Foundation of Guangdong Province (A2015065).

Abstract: Although attention mechanisms are widely used in many natural language processing tasks, related work on their application to sentence-level sentiment classification is still lacking. Taking advantage of self-attention's strength in learning important local features of sentences, this paper combines it with a long short-term memory (LSTM) network and proposes an attention-based neural network model, AttLSTM, which is then applied to sentence-level sentiment classification. AttLSTM first uses an LSTM to learn the context of the words in a sentence; it then uses a self-attention function to learn the position information of the words and builds the corresponding position weight matrix, from which the final semantic representation of the sentence is obtained by weighted averaging; finally, the result is classified and output by a multi-layer perceptron. Experimental results show that AttLSTM outperforms related work and achieves the highest accuracies of 82.8%, 88.3% and 91.3% on the public binary sentiment classification corpora Movie Reviews (MR), Stanford Sentiment Treebank (SSTb2) and Internet Movie Database (IMDB) respectively, as well as 50.6% accuracy on the multi-class sentiment classification corpus SSTb5.
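
To make the described pipeline concrete, the following is a minimal PyTorch sketch of an AttLSTM-style model: an LSTM encodes the sentence, a self-attention scorer assigns a weight to each position, the hidden states are weighted-averaged into a sentence vector, and a multi-layer perceptron produces the class scores. All layer sizes, the dropout rate and the single-layer attention form are illustrative assumptions; the paper's exact position-weighting scheme is not reproduced here.

    # Minimal sketch of an AttLSTM-style classifier (assumptions noted above);
    # not the authors' implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttLSTMSketch(nn.Module):
        def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=2):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.att_score = nn.Linear(hidden_dim, 1)   # scores each position
            self.mlp = nn.Sequential(                   # multi-layer perceptron head
                nn.Linear(hidden_dim, hidden_dim),
                nn.ReLU(),
                nn.Dropout(0.5),
                nn.Linear(hidden_dim, num_classes),
            )

        def forward(self, tokens):                      # tokens: (batch, seq_len)
            h, _ = self.lstm(self.embed(tokens))        # h: (batch, seq_len, hidden)
            scores = self.att_score(h).squeeze(-1)      # (batch, seq_len)
            weights = F.softmax(scores, dim=-1)         # position weight vector
            sent = torch.bmm(weights.unsqueeze(1), h).squeeze(1)  # weighted average
            return self.mlp(sent)                       # class logits

    # Hypothetical usage: a batch of 4 token-id sequences of length 20 from a 10k vocabulary.
    logits = AttLSTMSketch(vocab_size=10000)(torch.randint(0, 10000, (4, 20)))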

Key words: Deep learning, Long short-term memory, Natural language processing, Self-attention, Sentiment classification

CLC number: TP183