Computer Science ›› 2018, Vol. 45 ›› Issue (12): 142-147. doi: 10.11896/j.issn.1002-137X.2018.12.022

• Artificial Intelligence •

Study on Recursive Auto-encoding Sentiment Classification Based on Topic Enhancement

ZHU Yin, HUANG Hai-yan

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2017-11-07 Online: 2018-12-15 Published: 2019-02-25
  • About the authors: ZHU Yin (born 1993), male, master's student; main research interests: machine learning and text mining; E-mail: yami.zhu@foxmail.com. HUANG Hai-yan (born 1972), female, Ph.D., associate professor; main research interests: control and optimization, and modeling of complex industrial processes; E-mail: huanghong@ecust.edu.cn (corresponding author).

Abstract: Sentiment analysis of Chinese text aims to discover users' sentiment orientation toward things and events. However, existing studies often neglect the interrelationships between texts. This paper proposes a recursive auto-encoder sentiment classification model based on topic enhancement. By incorporating the topic information of a text into the recursive auto-encoder, the model can take the content of the text into account at a deeper level, improving both its understanding of text sentiment and its generalization ability. Experimental results on the COAE2014 dataset show that the proposed model achieves better classification performance on sentiment classification tasks, confirming its applicability and feasibility for practical problems.

Key words: Data mining, Recursive auto-encoder, Sentiment classification, Topic model

CLC Number: 

  • TP391