Computer Science (计算机科学), 2020, Vol. 47, Issue 11A: 78-82. doi: 10.11896/jsjkx.200400061
景丽, 李曼曼, 何婷婷
JING Li, LI Man-man, HE Ting-ting
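Abstract: In the era of a rapidly developing Internet, sentiment analysis of online reviews plays an important role in analyzing public opinion and monitoring e-commerce. Existing classification methods fall mainly into sentiment-lexicon methods and machine learning methods. Sentiment-lexicon methods rely heavily on the sentiment words in the lexicon: the more complete the lexicon and the more pronounced the sentiment polarity of a review, the better the classification. For reviews whose polarity is hard to distinguish, however, they perform poorly. Machine learning methods are supervised, and their performance depends on large amounts of pre-annotated corpora; annotation is currently done manually and is extremely laborious. This paper combines the characteristics of both approaches to build a sentiment classification model for online reviews. The model expands the sentiment lexicon using online reviews from the relevant domain and, based on the classification results of the lexicon method, trains a classifier through self-supervised learning, thereby improving classification accuracy on texts with ambiguous sentiment polarity. Experiments show that, compared with the sentiment-lexicon method and the machine learning method, the proposed model achieves better sentiment classification results on two datasets: hotel reviews and JD.com (京东) reviews.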
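The pipeline outlined in the abstract (lexicon scoring first, then a classifier trained on the lexicon's own confident outputs) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy English lexicon, the confidence threshold, the function names, and the choice of a multinomial Naive Bayes classifier. The abstract does not state which classifier, lexicon, or threshold the authors used, and their data are Chinese reviews.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the paper's expanded domain lexicon is not given in the abstract.
POSITIVE = {"clean", "friendly", "great", "comfortable"}
NEGATIVE = {"dirty", "rude", "noisy", "broken"}

def lexicon_score(tokens):
    """Number of positive lexicon hits minus number of negative hits."""
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)

def split_by_confidence(docs, threshold=1):
    """Pseudo-label documents with |score| >= threshold (assumed rule); the rest are ambiguous."""
    labeled, ambiguous = [], []
    for tokens in docs:
        score = lexicon_score(tokens)
        if abs(score) >= threshold:
            labeled.append((tokens, 1 if score > 0 else 0))
        else:
            ambiguous.append(tokens)
    return labeled, ambiguous

def train_naive_bayes(labeled):
    """Collect per-class word counts from pseudo-labeled documents (no manual labels)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for tokens, label in labeled:
        word_counts[label].update(tokens)
        class_counts[label] += 1
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab

def predict(model, tokens):
    """Return 1 (positive) or 0 (negative) using add-one smoothed log-probabilities."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_logp = None, -math.inf
    for label in (0, 1):
        logp = math.log((class_counts[label] + 1) / (total_docs + 2))
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                logp += math.log((word_counts[label][t] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label

# Toy reviews: the first four contain lexicon words, the last two do not.
reviews = [
    "room clean staff friendly breakfast tasty".split(),
    "great location comfortable bed breakfast tasty".split(),
    "room dirty staff rude shower leaking".split(),
    "noisy street broken lamp shower leaking".split(),
    "breakfast tasty".split(),
    "shower leaking".split(),
]
labeled, ambiguous = split_by_confidence(reviews)
model = train_naive_bayes(labeled)
predictions = [predict(model, tokens) for tokens in ambiguous]
print(predictions)
```

In this toy run the lexicon alone cannot score "breakfast tasty" or "shower leaking", since neither contains a lexicon word. The classifier trained on the pseudo-labeled reviews can still assign them a polarity from co-occurring words, which is the intuition behind using lexicon output as training signal for ambiguous texts.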