Computer Science, 2017, Vol. 44, Issue (10): 296-301. doi: 10.11896/j.issn.1002-137X.2017.10.053


Extraction Method of Sentimental Feature Vector Based on Semantic Similarity

LIN Jiang-hao, ZHOU Yong-mei, YANG Ai-min and CHENG Jin   

Online: 2018-12-01 Published: 2018-12-01

Abstract: To address the gaps in semantic representation and domain expansion of sentimental features, this paper proposes a method for extracting sentimental feature vectors based on semantic similarity. First, a Word2vec model is trained on 250 thousand Sogou news texts and 500 thousand micro-blog texts. Eighty sentimental words with obvious sentiment, rich content, and diverse parts of speech (POS) are chosen as a set of seed words. Then, the semantic similarity between each candidate sentimental word and the seed words is calculated from their word vectors. The sentimental words are mapped to a high-dimensional vector space, and the feature vector representation (Senti2vec) is extracted. Senti2vec is applied to similarity analysis of sentimental synonyms and antonyms, polarity classification of sentimental words, and sentimental text analysis. The experimental results show that Senti2vec can represent both the meaning and the sentiment of sentimental words. Because Senti2vec is based on semantic similarity computed from large-scale data, the method adapts more readily to different domains.
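The extraction step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that similarity is measured by cosine similarity and that each dimension of the feature vector is the similarity between a candidate word and one seed word. The function names (`cosine_similarity`, `senti2vec`), the two-dimensional toy embeddings, and the two seed words are all hypothetical; the paper uses a trained Word2vec model and eighty seed words.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine_similarity(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def senti2vec(word: str, seeds: List[str], embeddings: Dict[str, Vector]) -> List[float]:
    """Assumed feature construction: one similarity score per seed word."""
    word_vec = embeddings[word]
    return [cosine_similarity(word_vec, embeddings[seed]) for seed in seeds]


# Toy word vectors, invented for illustration only (not from a trained model).
toy_embeddings: Dict[str, Vector] = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "great": [2.0, 0.0],
    "awful": [-3.0, 0.0],
}
toy_seeds = ["good", "bad"]  # hypothetical seed set; the paper selects eighty seeds

great_features = senti2vec("great", toy_seeds, toy_embeddings)
awful_features = senti2vec("awful", toy_seeds, toy_embeddings)
```

Under these assumptions, the resulting vector has as many dimensions as there are seed words, so a candidate word is described by how close it lies to each seed in the embedding space.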

Key words: Sentimental feature vector, Semantic similarity, Sentiment word, Word2vec

