Computer Science ›› 2020, Vol. 47 ›› Issue (4): 204-210. doi: 10.11896/jsjkx.190100097

• Artificial Intelligence •

Sentiment Classification Method for Sentences via Self-attention

YU Shan-shan1, SU Jin-dian2, LI Peng-fei2   

  1 College of Medical Information Engineering, Guangdong Pharmaceutical University, Guangzhou 510006, China;
  2 College of Computer Science and Engineering, South China University of Technology, Guangzhou 510640, China
  • Received: 2019-01-13   Online: 2020-04-15   Published: 2020-04-15
  • Contact: SU Jin-dian, born in 1980, Ph.D, associate professor. His main research interests include natural language processing, artificial intelligence and machine learning.
  • About author: YU Shan-shan, born in 1980, Ph.D, senior member of China Computer Federation. Her main research interests include machine learning, big data and the semantic Web.
  • Supported by:
    This work was supported by the Natural Science Foundation of Guangdong Province (2015A030310318), the Applied Science and Technology Special Project of the Department of Science and Technology of Guangdong Province (20168010124010), and the Medical Scientific Research Foundation of Guangdong Province (A2015065).

Abstract: Although attention mechanisms are widely used in many natural language processing tasks, there is still little work on their application to sentence-level sentiment classification. Exploiting the strength of self-attention in learning important local features of sentences, a multi-layer attentional neural network named AttLSTM, which combines a long short-term memory (LSTM) network with an attention mechanism, is proposed and applied to sentence-level sentiment classification. AttLSTM first uses an LSTM network to capture sentence context, then applies self-attention functions to learn the positional information of words in the sentence and builds the corresponding position weight matrix, from which the final semantic representation of the sentence is obtained by weighted averaging. Finally, the result is classified and output by a multi-layer perceptron. Experimental results show that AttLSTM outperforms related methods, achieving accuracies of 82.8%, 88.3% and 91.3% on the public binary sentiment classification corpora Movie Reviews (MR), Stanford Sentiment Treebank (SSTb2) and Internet Movie Database (IMDB) respectively, and 50.6% on the multi-class corpus SSTb5.
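The pipeline described in the abstract can be made concrete with a short sketch. The following PyTorch code is a minimal, illustrative rendering of an AttLSTM-style model, assuming a single-layer LSTM, dot-product self-attention as one possible realization of the position weight matrix, and arbitrary layer sizes; none of these choices are taken from the paper itself.

```python
# Minimal AttLSTM-style sketch (illustrative assumptions throughout):
# LSTM context encoding -> self-attention position weights ->
# weighted-average sentence vector -> multi-layer perceptron classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttLSTM(nn.Module):
    def __init__(self, vocab_size, embed_dim=300, hidden_dim=150, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # LSTM captures the context of each word in the sentence.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Linear map used to score hidden states against each other
        # (dot-product self-attention; an assumed weighting scheme).
        self.attn = nn.Linear(hidden_dim, hidden_dim)
        # Multi-layer perceptron produces the class scores.
        self.mlp = nn.Sequential(
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, num_classes))

    def forward(self, tokens):                    # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embedding(tokens))  # h: (batch, seq_len, hidden)
        # Position weight matrix: each position attends to every other one.
        scores = torch.bmm(self.attn(h), h.transpose(1, 2))  # (batch, L, L)
        weights = F.softmax(scores, dim=-1)
        attended = torch.bmm(weights, h)          # position-reweighted states
        sentence = attended.mean(dim=1)           # weighted-average sentence vector
        return self.mlp(sentence)                 # logits for the classifier

# Usage: a batch of 8 sentences, 40 token ids each, over a 20k vocabulary.
model = AttLSTM(vocab_size=20000)
logits = model(torch.randint(0, 20000, (8, 40)))  # shape (8, 2)
```

Dropout, padding masks and training details (loss, optimizer) are omitted for brevity; the two-class output head matches the MR/SSTb2/IMDB setting, while num_classes=5 would correspond to SSTb5.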

Key words: Deep learning, Long short-term memory, Natural language processing, Self-attention, Sentiment classification

CLC Number: TP183