Computer Science, 2020, Vol. 47, Issue (1): 231-236. doi: 10.11896/jsjkx.181102130
WANG Li-zhi (王立志)1, MU Xiao-dong (慕晓冬)1, LIU Hong-lan (刘宏岚)2
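Abstract: In recent years, as the number of Internet users has grown, the volume of user reviews has increased explosively, bringing with it a large amount of information that can be used for reference and mined in depth. Text sentiment classification emerged in response. Prediction accuracy and execution speed are the key criteria for judging a classification model. Text sentiment classification with a traditional SVM is algorithmically simple and easy to implement, but its classification accuracy is determined by the model parameters. To address this, this paper combines an improved particle swarm optimization (PSO) algorithm with the SVM classifier and uses the resulting improved-PSO-optimized SVM to analyze the sentiment of film and television reviews. First, Douban movie reviews are collected with a web crawler; after preprocessing, the text is vectorized with weighted word2vec to produce input that the support vector machine can accept. Then, the PSO algorithm is improved with an adaptive decreasing inertia weight strategy and a crossover operator, and it is used to optimize the SVM model's loss function, penalty parameter, and kernel function parameters. Finally, the text is classified by sentiment. Experimental results on the same data set show that the proposed method effectively avoids two problems: the weakness of traditional sentiment-lexicon methods, which are affected by word order and context, and the vanishing or dispersing gradients that arise when convolution is used. It also overcomes PSO's tendency to become trapped in local optima. Compared with other methods, the proposed classification model runs faster and effectively improves classification accuracy.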
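The improved PSO described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update formulas, so everything specific here is an assumption: the inertia weight is decreased linearly from `w_max` to `w_min` (a common stand-in for an "adaptive decreasing" schedule), the crossover operator is an arithmetic blend of two particles that is kept only if it improves the objective, and the function name `pso_minimize` and all default parameter values are hypothetical. In the paper's setting the objective would be something like the SVM's validation error as a function of the penalty parameter and kernel parameter; a simple sphere function is used below so the sketch stays self-contained.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w_max=0.9, w_min=0.4, c1=1.5, c2=1.5,
                 crossover_rate=0.2, seed=0):
    """Minimize `objective` over the box `bounds` with a PSO variant.

    Assumed (not taken from the paper): linear inertia decay and an
    arithmetic crossover that is accepted only when it improves fitness.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(n_iters):
        # Inertia weight shrinks from w_max to w_min over the run.
        w = w_max - (w_max - w_min) * t / max(1, n_iters - 1)

        # Standard velocity and position update, clamped to the bounds.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

        # Crossover: blend a particle with a random partner and keep
        # the child only if it has a lower objective value.
        for i in range(n_particles):
            if rng.random() < crossover_rate:
                j = rng.randrange(n_particles)
                a = rng.random()
                child = [a * pos[i][d] + (1 - a) * pos[j][d]
                         for d in range(dim)]
                if objective(child) < objective(pos[i]):
                    pos[i] = child

        # Refresh personal and global bests.
        for i in range(n_particles):
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

    return gbest, gbest_val


if __name__ == "__main__":
    # Stand-in objective; an SVM validation error would replace it.
    best, value = pso_minimize(lambda p: sum(x * x for x in p),
                               [(-5.0, 5.0), (-5.0, 5.0)])
    print(best, value)
```

To tune an SVM with this sketch, `bounds` would hold the search ranges for the penalty parameter and the kernel parameter, and `objective` would train the classifier and return its validation error. Those ranges and the evaluation protocol are not stated in the abstract.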