Computer Science (计算机科学), 2019, Vol. 46, Issue (7): 172-179. doi: 10.11896/j.issn.1002-137X.2019.07.027

• Artificial Intelligence •

Text Sentiment Classification Based on Deep Forests with Enhanced Features

HAN Hui1, WANG Li-ming1, CHAI Yu-mei1, LIU Zhen2

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China
  2. School of Information Science and Technology, Ningbo University, Ningbo, Zhejiang 315211, China
  • Received: 2018-06-12 Online: 2019-07-15 Published: 2019-07-15
  • About the authors: HAN Hui (born 1992), female, master's degree; main research interest: natural language processing; E-mail: 18337149649@163.com. WANG Li-ming (born 1963), male, Ph.D., professor, CCF senior member; main research interests: modern software engineering technology, distributed artificial intelligence, and data mining; E-mail: ielmwang@zzu.edu.cn (corresponding author). CHAI Yu-mei (born 1964), female, master's degree, professor; main research interests: machine learning, data mining, and natural language processing. LIU Zhen (born 1965), male, Ph.D., research fellow; main research interests: virtual reality and affective computing.
  • Supported by:
    National Natural Science Foundation of China (U1636111)

Abstract: To predict the sentiment orientation of review texts effectively, this paper proposes BFDF (Boosting Feature of Deep Forest), a deep forest algorithm based on enhanced representation learning, for text sentiment classification. First, binary features and sentiment-semantic probability features are extracted. Second, the evaluation objects in the binary features are clustered and the features are fused. Then, the representation learning ability of the deep forest's cascade layers is improved so that feature information is not gradually diminished from layer to layer. Finally, the AdaBoost method is integrated into the deep forest so that the model takes the importance of different features into account, yielding the improved model BFDF. Experiments on a hotel review corpus demonstrate the effectiveness of the proposed method.

Key words: AdaBoost, Deep forest, Feature extraction, Sentiment classification
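
The two building blocks named in the abstract can be illustrated with a minimal sketch. It assumes the textbook forms of both techniques: the standard deep forest cascade, in which each forest's class-probability vector is concatenated with the original features before the next layer, and one round of discrete AdaBoost for labels in {-1, +1}. The function names `cascade_input` and `adaboost_round` are hypothetical, and the sketch does not reproduce the specific BFDF modifications, which the abstract does not detail.

```python
import math


def cascade_input(original_features, class_vectors):
    """Build the next cascade layer's input (standard deep forest scheme).

    The original feature vector is kept and each forest's class-probability
    vector is appended to it. Illustrative helper, not the paper's code.
    """
    augmented = list(original_features)
    for vec in class_vectors:
        augmented.extend(vec)
    return augmented


def adaboost_round(weights, y_true, y_pred):
    """One round of discrete AdaBoost for binary labels in {-1, +1}.

    Returns (alpha, new_weights): the weak learner's vote weight and the
    re-normalised sample weights. Generic AdaBoost, not the BFDF variant.
    """
    # Weighted error of the weak learner under the current distribution.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    # Clamp to avoid log(0) or division by zero for perfect/useless learners.
    err = min(max(err, 1e-10), 1 - 1e-10)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples are up-weighted, correct ones down-weighted.
    updated = [w * math.exp(-alpha * t * p)
               for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]
```

With four equally weighted samples and one misclassification, the misclassified sample ends up carrying half of the total weight after the update, which is how boosting steers later learners toward hard examples.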

CLC Number: TP391