Computer Science ›› 2021, Vol. 48 ›› Issue (6A): 349-356. doi: 10.11896/jsjkx.200800004

• Intelligent Computing •

Research on Sentiment Analysis Based on Transformer and Multi-channel Convolutional Neural Network

HUO Shuai1,2, PANG Chun-jiang1

  1. 1 North China Electric Power University (Baoding), Baoding, Hebei 071003, China
    2 Graduate Workstation of Yunnan Power Grid Co., Ltd. Electric Power Research Institute, Kunming 650217, China
  • Online: 2021-06-10 Published: 2021-06-17
  • Corresponding author: PANG Chun-jiang (972158083@qq.com)
  • Author e-mail: hdhuoshuai@163.com
  • About author: HUO Shuai, born in 1994, postgraduate. His main research interests include emotion analysis and deep learning.
    PANG Chun-jiang, born in 1965, associate professor. His main research interests include artificial intelligence and internet of things.
  • Supported by:
    Yunnan Science and Technology Project (YNKJXM20180019, YNKJXM20191572).

Abstract: Text sentiment analysis is a classic task in natural language processing. This paper proposes a text sentiment analysis model that combines a Transformer feature extractor with a multi-channel convolutional neural network (CNN). Starting from static word vectors trained by traditional methods such as Word2vec and GloVe, the Transformer feature extractor produces hierarchical, dynamic word representations, and fine-tuning on the target dataset further improves the representational power of the word vectors. The multi-channel CNN models dependencies among word sequences over windows of different sizes; it extracts features while reducing dimensionality and captures the contextual semantic information of a sentence, so that the model picks up more sentiment-related semantic information and the semantic expressiveness of the text improves. A Softmax activation function then classifies the sentiment polarity. Experiments on the IMDb and SST-2 movie review datasets give test-set accuracies of 90.4% and 90.2%, respectively, indicating that the proposed model improves classification accuracy to some extent over models that combine traditional word embeddings with a CNN or RNN.

Key words: Feature extractor, Multi-channel convolutional neural network, Sentiment classification, Transformer
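
As a rough illustration of the multi-channel convolution stage described in the abstract, the following pure-Python sketch runs filters of several window widths over a sequence of token vectors, applies max-over-time pooling, concatenates the pooled features, and feeds them to a softmax layer. The window widths (2, 3, 4), filter count, vector dimension, ReLU activation, and random weights are all assumptions chosen for illustration; the abstract does not specify them. The random `token_vectors` merely stand in for the Transformer's contextual word representations, which are not implemented here.

```python
import math
import random


def conv1d_max_pool(seq, kernel, bias):
    """Slide one filter over the token sequence, then max-pool over time.

    seq:    token vectors, shape [seq_len][dim]
    kernel: filter weights, shape [width][dim]
    """
    width = len(kernel)
    activations = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        z = bias + sum(w * x
                       for k_row, x_row in zip(kernel, window)
                       for w, x in zip(k_row, x_row))
        activations.append(max(0.0, z))  # ReLU (assumed activation)
    return max(activations)  # one scalar per filter


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multi_channel_cnn(seq, channels, out_weights, out_bias):
    """channels maps a window width to a list of (kernel, bias) filters."""
    features = []
    for width in sorted(channels):
        for kernel, bias in channels[width]:
            features.append(conv1d_max_pool(seq, kernel, bias))
    # Linear layer on the concatenated pooled features, then softmax.
    logits = [b + sum(w * f for w, f in zip(row, features))
              for row, b in zip(out_weights, out_bias)]
    return features, softmax(logits)


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
SEQ_LEN, DIM, FILTERS_PER_CHANNEL, NUM_CLASSES = 10, 8, 4, 2  # illustrative sizes
WIDTHS = (2, 3, 4)  # assumed window widths, not taken from the paper

# Hypothetical stand-in for the Transformer's output for one sentence.
token_vectors = random_matrix(rng, SEQ_LEN, DIM)
channels = {w: [(random_matrix(rng, w, DIM), 0.0)
                for _ in range(FILTERS_PER_CHANNEL)]
            for w in WIDTHS}
out_weights = random_matrix(rng, NUM_CLASSES, len(WIDTHS) * FILTERS_PER_CHANNEL)
out_bias = [0.0] * NUM_CLASSES

features, probs = multi_channel_cnn(token_vectors, channels, out_weights, out_bias)
print(len(features), probs)
```

With these sizes, a 10 x 8 input is reduced to 12 pooled features (3 widths x 4 filters), which is the dimensionality-reduction effect the abstract attributes to the convolutional stage. In a trained model the weights would be learned rather than random.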

CLC Number:

  • TP391.1