Computer Science ›› 2020, Vol. 47 ›› Issue (11): 275-279. doi: 10.11896/jsjkx.191000174

• Artificial Intelligence •

Semantic Similarity-based Method for Sentiment Classification

MA Xiao-hui1, JIA Jun-zhi2, ZHOU Xiang-zhen3, YAN Jun-ya1   

    1 Information Faculty, Business College of Shanxi University, Taiyuan 030031, China
    2 School of Information Resource Management, Renmin University of China, Beijing 100872, China
    3 National Academy of Economic Strategy, Chinese Academy of Social Sciences, Beijing 100028, China
  • Received: 2019-10-27 Revised: 2019-12-19 Online: 2020-11-15 Published: 2020-11-05
  • About author: MA Xiao-hui, born in 1982, master, associate professor. Her main research interests include information retrieval, computer application technology and sentiment analysis.
  • Supported by:
    This work was supported by the Key Program of Shanxi Provincial Department of Science and Technology (201603D321112), the 13th Five-year Plan of Shanxi Provincial Education Department (GH-17097), the Young Scientists Fund of the National Natural Science Foundation of China (61702026) and the 2018 Science and Technology Research Project of Henan Province (182102110277).

Abstract: Sentiment lexicons support sentiment analysis by allowing text to be classified through word matching. However, they are limited in vocabulary coverage and domain adaptation. This paper therefore proposes a sentiment classification method based on semantic similarity measurement and embedding representation. The method calculates the semantic similarity between the text to be classified and the sentiment lexicon, and combines semantic distance with embedding-based features to classify sentiment, which helps address the insufficient use of semantic features. Classification performance is evaluated for feature vectors extracted from word vectors, for sentiment lexicon matching, and for the proposed method. Experimental results show that the proposed method outperforms the comparison method. On three e-commerce review test corpora, its average F1 value reaches 83.46%, an increase of 8.26% over the comparison method. Features extracted by combining word embeddings with ECSD (E-Commerce Sentiment Dictionary) perform best, with a performance improvement of 9%. This indicates that combining semantic similarity enriches the extracted sentiment semantic features and effectively improves sentiment classification performance.
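
The feature combination described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure or how features are merged, so cosine similarity, mean-pooled word vectors, max-pooling over tokens, and simple concatenation are all assumed here. The function names, the toy two-dimensional embeddings, and the two-word lexicon are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors, dim):
    """Element-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def extract_features(tokens, embeddings, lexicon, dim):
    """Concatenate an embedding feature with lexicon-similarity features.

    Assumption: the embedding feature is the mean of the token vectors, and
    each similarity feature is the highest cosine similarity between one
    lexicon word and any token in the text.
    """
    token_vectors = [embeddings[t] for t in tokens if t in embeddings]
    text_vector = mean_vector(token_vectors, dim)

    similarities = []
    for word in lexicon:
        lexicon_vector = embeddings.get(word)
        if lexicon_vector is None or not token_vectors:
            similarities.append(0.0)
        else:
            similarities.append(
                max(cosine(v, lexicon_vector) for v in token_vectors)
            )
    return text_vector + similarities


# Toy data, purely illustrative.
embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "great": [0.8, 0.6],
    "phone": [0.0, 1.0],
}
lexicon = ["good", "bad"]

features = extract_features(["great", "phone"], embeddings, lexicon, dim=2)
print(features)
```

The resulting vector could then be passed to any standard classifier. Because "great" does not appear in the toy lexicon, exact word matching would miss it, while the similarity feature for "good" still picks it up; this is the coverage problem the abstract attributes to lexicon matching.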

Key words: Feature selection, Semantic similarity, Sentiment classification, Sentiment lexicon, Word embedding

CLC Number: 

  • TP391