Computer Science ›› 2020, Vol. 47 ›› Issue (4): 204-210. doi: 10.11896/jsjkx.190100097

• Artificial Intelligence •

Sentiment Classification Method for Sentences via Self-attention

YU Shan-shan¹, SU Jin-dian², LI Peng-fei²

  ¹ College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China;
  ² College of Computer Science and Engineering, South China University of Technology, Guangzhou 510640, China
  • Received: 2019-01-13 Online: 2020-04-15 Published: 2020-04-15
  • Contact: SU Jin-dian, born in 1980, Ph.D, associate professor. His main research interests include natural language processing, artificial intelligence and machine learning.
  • About author: YU Shan-shan, born in 1980, Ph.D, is a senior member of the China Computer Federation. Her main research interests include machine learning, big data and the semantic Web.
  • Supported by:
    This work was supported by the Natural Science Foundation of Guangdong Province (2015A030310318), the Applied Scientific and Technology Special Project of the Department of Science and Technology of Guangdong Province (20168010124010), and the Medical Scientific Research Foundation of Guangdong Province (A2015065).

Abstract: Although attention mechanisms are widely used in many natural language processing tasks, few works have studied their application to sentence-level sentiment classification. Exploiting the ability of self-attention to learn the important local features of a sentence, this paper proposes AttLSTM, a multi-layer attentional neural network based on a long short-term memory (LSTM) network and an attention mechanism, and applies it to sentiment classification of sentences. AttLSTM first uses an LSTM network to capture the context of a sentence, then applies self-attention functions to learn position information about the words in the sentence and build the corresponding position weight matrix, which yields the final semantic representation of the sentence by weighted averaging. Finally, the result is classified and output by a multi-layer perceptron. Experimental results show that AttLSTM outperforms several related works, achieving the highest accuracies of 82.8%, 88.3% and 91.3% on the public two-class sentiment classification corpora Movie Reviews (MR), Stanford Sentiment Treebank (SSTb2) and Internet Movie Database (IMDB), respectively, and 50.6% on the multi-class corpus SSTb5.

Key words: Deep learning, Sentiment classification, Self-attention, Long-short term memory, Natural language processing

CLC Number: 

  • TP183
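
The weighted-averaging step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so the additive form used here (a `tanh` projection followed by a scoring vector and a softmax over word positions) is an assumption, as are the names `attention_pool`, `W1`, `w2` and the toy values. The hidden states stand in for LSTM outputs, and the final multi-layer perceptron is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, W1, w2):
    """Pool per-word hidden states into a single sentence vector.

    hidden_states: T vectors of length d (stand-ins for LSTM outputs).
    W1: k rows of length d (hypothetical projection matrix).
    w2: k floats (hypothetical scoring vector).
    Returns (weights, sentence_vector).
    """
    scores = []
    for h in hidden_states:
        # Project the hidden state and squash it with tanh.
        projected = [math.tanh(sum(w * x for w, x in zip(row, h))) for row in W1]
        # Reduce the projection to one scalar score for this word position.
        scores.append(sum(a * p for a, p in zip(w2, projected)))
    # One weight per word position; the weights sum to 1.
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Weighted average of the hidden states, dimension by dimension.
    sentence = [
        sum(wt * h[j] for wt, h in zip(weights, hidden_states))
        for j in range(dim)
    ]
    return weights, sentence


# Toy input: a three-word sentence with three-dimensional hidden states.
H = [[0.1, -0.2, 0.3], [0.9, 0.8, -0.5], [0.0, 0.1, 0.0]]
W1 = [[0.5, -0.3, 0.2], [0.1, 0.4, -0.6]]
w2 = [1.0, -1.0]
weights, sentence = attention_pool(H, W1, w2)
```

Here `weights` holds one non-negative weight per word and `sentence` is the resulting fixed-length representation. The position weight matrix mentioned in the abstract suggests several such weight vectors; under this sketch each would be computed the same way with its own scoring vector.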