Computer Science ›› 2021, Vol. 48 ›› Issue (2): 245-249. doi: 10.11896/jsjkx.200100078

• Artificial Intelligence •

  • Corresponding author: GUO Xin (guoxinjsj@sxu.edu.cn)
  • First author: CHEN Qian (chenqian@sxu.edu.cn)

Recurrent Convolution Attention Model for Sentiment Classification

CHEN Qian1,2, CHE Miao-miao1, GUO Xin1, WANG Su-ge1,2   

  1. School of Computer & Information Technology,Shanxi University,Taiyuan 030006,China
    2. Key Laboratory of Computational Intelligence and Chinese Information Processing,Ministry of Education,Taiyuan 030006,China
  • Received:2020-01-13 Revised:2020-06-19 Online:2021-02-15 Published:2021-02-04
  • About author:CHEN Qian,born in 1983,Ph.D,associate professor,master supervisor,is a member of China Computer Federation.His main research interests include topic detection and evolution,machine reading comprehension and natural language processing.
    GUO Xin,born in 1982,Ph.D,associate professor,master supervisor,is a member of China Computer Federation.Her main research interests include feature learning and natural language processing.
  • Supported by:
    The National Natural Science Foundation of China(61502288,61403238),Natural Science Foundation of Shanxi Pro-vince(201901D111032,201701D221101) and Key Research and Development Project of Shanxi Province(201803D421024).


Abstract: Sentiment classification is an important research direction in natural language processing, with application value for downstream tasks such as recommendation systems, automatic question answering and reading comprehension. The task depends on both the global and the local information hidden in context, yet existing neural network models cannot capture local and global context information at the same time. This paper proposes a recurrent convolutional attention model (LSTM-CNN-ATT, LCA) for single-label and multi-label sentiment classification. The model uses an attention mechanism to fuse the local-information extraction ability of a convolutional neural network (CNN) with the global-information extraction ability of a recurrent neural network (RNN), and consists of a word embedding layer, a context representation layer, a convolution layer and an attention layer. For multi-label sentiment classification, topic information is added to the attention layer to further guide the accurate extraction of each label's sentiment tendency. The F1 score on two single-label datasets reaches 82.1%, comparable to state-of-the-art single-label models. On two multi-label datasets, the results on the small dataset are close to the baseline models, while the F1 score on the large dataset reaches 78.38%, exceeding the state-of-the-art model. These results indicate that the LCA model is stable and broadly applicable.
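The core idea in the abstract, attention weights computed from CNN-extracted local features and then applied to pool RNN-extracted global features into a document vector, can be illustrated with a small NumPy sketch. The feature matrices below are random stand-ins for real BiLSTM/CNN outputs, and the scoring function is an assumption for illustration, not the authors' published configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, dim = 6, 4
G = rng.normal(size=(seq_len, dim))  # stand-in for RNN (global) context features
L = rng.normal(size=(seq_len, dim))  # stand-in for CNN (local) n-gram features
v = rng.normal(size=dim)             # attention scoring vector (learned in a real model)

# Score each token position from its local features and normalize with softmax...
scores = np.tanh(L) @ v
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# ...then pool the global features into a single document vector for the classifier.
doc_vec = weights @ G
print(doc_vec.shape)  # (4,)
```

For the multi-label case, the topic information mentioned in the abstract could enter the scoring step, e.g. as an added topic embedding inside the `tanh`; like the scoring function itself, that detail is a hypothetical reading of the abstract, not a confirmed design.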

Key words: Attention mechanism, Convolutional neural network, Recurrent neural network, Sentiment classification

CLC number: TP391