Computer Science (计算机科学), 2023, Vol. 50, Issue (3): 307-314. doi: 10.11896/jsjkx.211200189

• Artificial Intelligence •

Sentiment Analysis of Chinese Short Text Combining Context and Dependent Syntactic Information

DU Qiming1,2, LI Nan1,2, LIU Wenfu1,2,3, YANG Shudan1,2, YUE Feng1,2

  1 School of Cyberspace Security Academy, Information Engineering University, Zhengzhou 450000, China
  2 State Key Laboratory of Mathematical Engineering and Advanced Computing, Information Engineering University, Zhengzhou 450000, China
  3 State Key Laboratory of Complex Electromagnetic Environment Effect on Electronic and Information System, Luoyang, Henan 471003, China
  • Received: 2021-12-16 Revised: 2022-04-21 Online: 2023-03-15 Published: 2023-03-15
  • Corresponding author: LI Nan (linan_happy@126.com)
  • About author: DU Qiming (qimingducst@163.com), born in 1998, postgraduate. His main research interests include big data analysis and natural language processing.
    LI Nan, born in 1977, Ph.D., associate professor. His main research interests include high-performance computing, big data analysis, and big data security.
  • Supported by:
    National Natural Science Foundation of China (61802433).

Abstract: Dependency parsing analyzes the syntactic structure of sentences from a linguistic perspective. Existing studies suggest that combining this graph-like data with a graph convolutional network (GCN) helps a model better understand text semantics. However, when converting dependency syntactic information into an adjacency matrix, these methods ignore the types of the syntactic dependency labels and do not consider the word semantics associated with those labels, so the model cannot capture the deep sentiment features in the text. To address these problems, this paper proposes CDSI (Context and Dependency Syntactic Information), a sentiment analysis model for Chinese short text. The model not only uses a bidirectional long short-term memory (BiLSTM) network to extract the contextual semantics of the text, but also introduces a dependency-aware embedding representation that, based on the syntactic structure, mines the contribution weights of different dependency paths to the sentiment classification task. A GCN then models the contextual and dependency syntactic information jointly to strengthen the sentiment features in the text representation. Experiments on the SWB, NLPCC2014 and SMP2020-EWEC datasets show that CDSI effectively integrates the semantic and syntactic structure information in sentences and achieves good results in both binary and multi-class sentiment classification of Chinese short text.

Key words: Syntactic structure, Context information, GCN, Chinese short text

CLC Number:

  • TP391.1
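
The abstract's central idea, turning a dependency parse into an adjacency matrix that keeps the dependency label types and then convolving over it, can be illustrated with a small sketch. This is not the authors' CDSI implementation: the abstract gives no equations, so the function names, the fixed per-label weights (which a real model would presumably learn), the toy parse, and the random features standing in for BiLSTM outputs are all assumptions made for illustration. The propagation rule is the standard symmetric-normalized GCN layer.

```python
import numpy as np


def label_aware_adjacency(n_tokens, arcs, label_weight):
    """Build a symmetric adjacency matrix from dependency arcs.

    arcs: iterable of (head, dependent, label) with 0-based token indices.
    label_weight: hypothetical mapping from dependency label to edge weight;
    unknown labels fall back to 1.0 (i.e. a label-agnostic edge).
    """
    adj = np.eye(n_tokens)  # self-loops keep each token's own features
    for head, dep, label in arcs:
        w = label_weight.get(label, 1.0)
        adj[head, dep] = w
        adj[dep, head] = w
    return adj


def gcn_layer(adj, features, weight):
    """One GCN layer: ReLU(D^-1/2 A D^-1/2 H W), with self-loops already in A."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    norm_adj = adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(norm_adj @ features @ weight, 0.0)


# Toy parse of a 4-token sentence; labels follow LTP-style names
# (SBV = subject, VOB = object, ATT = attribute). The parse is illustrative.
arcs = [(1, 0, "SBV"), (1, 3, "VOB"), (3, 2, "ATT")]
label_weight = {"SBV": 1.5, "VOB": 2.0, "ATT": 0.5}  # assumed, not learned here

rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))   # stand-in for BiLSTM token representations
w_gcn = rng.normal(size=(8, 3))    # stand-in for a trained GCN weight matrix

adj = label_aware_adjacency(4, arcs, label_weight)
out = gcn_layer(adj, hidden, w_gcn)
print(out.shape)
```

Setting every entry of `label_weight` to 1.0 recovers the label-agnostic adjacency matrix that the abstract criticizes, which makes the two settings easy to compare in this sketch.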