Computer Science, 2015, Vol. 42, Issue (6): 61-66. doi: 10.11896/j.issn.1002-137X.2015.06.014

Hybrid Algorithm Framework for Sentiment Classification of Chinese Based on Semantic Comprehension and Machine Learning

XU Jian-feng, XU Yuan, XU Yuan-chen, ZHANG Yuan-jian and LIU Qing   

  • Online: 2018-11-14   Published: 2018-11-14

Abstract: In the era of big data, determining sentiment orientation quickly, accurately, and comprehensively from the large volume of text on the Internet is a major challenge. Existing sentiment classification methods fall roughly into two categories: semantic comprehension and supervised machine learning. Semantic comprehension methods have the advantage that they can classify text from different fields; however, their performance is strongly affected by the variety of word collocations and sentence patterns. Supervised machine learning methods can achieve higher classification accuracy, but a classifier that performs well in one field may not be suitable for a new field. This paper proposes a new hybrid algorithm framework for Chinese sentiment classification that combines optimized semantic comprehension with machine learning based on features extracted by information gain. Experimental results on two separate fields show that the framework achieves both high classification accuracy and satisfactory portability.

Key words: Sentiment classification, Semantic, Machine learning
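
The abstract names information gain as the feature extraction criterion but gives no details. As a rough illustration only, the sketch below computes the textbook information gain of a term with respect to the class labels and ranks a vocabulary by it. The function names, the set-of-words document representation, and the toy reviews are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """IG(t) = H(C) - P(t) * H(C | t present) - P(not t) * H(C | t absent).

    docs is a list of sets of words (hypothetical representation).
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


def top_k_terms(docs, labels, k):
    """Return the k terms with the highest information gain (ties broken alphabetically)."""
    vocab = set().union(*docs)
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Toy, made-up reviews: each document is the set of words it contains.
docs = [
    {"good", "screen"},
    {"great", "screen"},
    {"bad", "battery"},
    {"bad", "screen"},
]
labels = ["pos", "pos", "neg", "neg"]

best = top_k_terms(docs, labels, 1)
```

In this toy data the word "bad" separates the two classes perfectly, so it receives the maximum gain of one bit, while "screen" appears in both classes and scores much lower. In a real pipeline the selected terms would then serve as the feature set for the supervised classifier.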

