Computer Science (计算机科学), 2023, Vol. 50, Issue (5): 248-254. doi: 10.11896/jsjkx.220400069

• Artificial Intelligence •


Chinese Sentiment Analysis Based on CNN-BiLSTM Model of Multi-level and Multi-scale Feature Extraction

WANG Lin (汪林), MENG Zuqiang (蒙祖强), YANG Lina (杨丽娜)

  1. School of Computer and Electronic Information, Guangxi University, Nanning 530004, China
  • Received: 2022-04-07 Revised: 2022-09-22 Online: 2023-05-15 Published: 2023-05-06
  • Corresponding author: MENG Zuqiang (zqmeng@126.com)
  • About author: (wanglingxun2021@163.com)
    WANG Lin, born in 1996, postgraduate, is a member of the China Computer Federation. His main research interests include natural language processing and machine learning.
    MENG Zuqiang, born in 1974, Ph.D., professor, is a senior member of the China Computer Federation. His main research interests include artificial intelligence, multimodal learning and granular computing.
  • Supported by:
    National Natural Science Foundation of China (62266004, 61862005).

Abstract: Sentiment analysis, a subfield of natural language processing (NLP), plays an important role in public opinion monitoring. In Chinese sentiment analysis, existing methods consider sentiment features at only a single level and a single scale, so they cannot fully mine and exploit sentiment feature information, and model performance suffers. To address this problem, a CNN-BiLSTM model with multi-level and multi-scale feature extraction is proposed. The model first obtains word-level features from a pre-trained Chinese word vector model combined with fine-tuning of the embedding layer. It then obtains phrase-level and sentence-level features with a multi-scale phrase-level feature representation module and a sentence-level feature representation module, respectively; in the former, convolutional networks with different kernel sizes capture phrase-level features at different scales. Finally, a multi-level feature fusion method fuses the word-level features, the phrase-level features of different scales, and the sentence-level features into multi-level joint features, which carry more sentiment information than single-level, single-scale features. In the experiments, four evaluation metrics (Accuracy, Precision, Recall, F1) are used to evaluate model performance, and the model is compared with eight methods, including support vector machines (SVM). The results show that the proposed method scores higher than all eight comparison methods on all four metrics, demonstrating the advantage of the proposed model in multi-level and multi-scale feature extraction.

Key words: Natural language processing, Chinese sentiment analysis, Multi-level and multi-scale feature, Convolutional neural network, Bidirectional long short-term memory network
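
The multi-scale phrase-level extraction and multi-level fusion described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not state the kernel sizes, activation, pooling, or fusion operator, so the ReLU activation, max-over-time pooling, mean-pooled word-level feature, and fusion by concatenation used here are all assumptions, as are the function names. The sentence-level vector is passed in as a stand-in for the output of the BiLSTM branch, which is not implemented.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def conv1d_max_pool(words: Sequence[Vector], kernel: Sequence[Vector]) -> float:
    """Slide one filter over the sentence, apply ReLU, then max-pool over time.

    `kernel` has k rows, each as wide as a word vector, so every window
    covers a phrase of k consecutive words.
    """
    k = len(kernel)
    if len(words) < k:
        raise ValueError("sentence is shorter than the kernel")
    activations = []
    for start in range(len(words) - k + 1):
        window = words[start:start + k]
        score = sum(
            w * x
            for k_row, w_row in zip(kernel, window)
            for w, x in zip(k_row, w_row)
        )
        activations.append(max(0.0, score))  # ReLU (assumed activation)
    return max(activations)  # max-over-time pooling (assumed)


def multi_scale_phrase_features(
    words: Sequence[Vector], kernels_by_size: Dict[int, List[List[Vector]]]
) -> Vector:
    """Return one pooled feature per filter, ordered by kernel size (scale)."""
    features = []
    for size in sorted(kernels_by_size):
        for kernel in kernels_by_size[size]:
            features.append(conv1d_max_pool(words, kernel))
    return features


def mean_pool(words: Sequence[Vector]) -> Vector:
    """Assumed word-level summary: average each embedding dimension."""
    return [sum(column) / len(words) for column in zip(*words)]


def fuse(word_level: Vector, phrase_level: Vector, sentence_level: Vector) -> Vector:
    """Assumed fusion operator: plain concatenation of the three levels."""
    return list(word_level) + list(phrase_level) + list(sentence_level)


if __name__ == "__main__":
    # Four tokens with toy 2-dimensional embeddings (hypothetical values).
    sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    # One hand-written filter per scale: kernel sizes 2 and 3 (assumed sizes).
    kernels = {
        2: [[[1.0, 0.0], [0.0, 1.0]]],
        3: [[[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]],
    }
    phrase = multi_scale_phrase_features(sentence, kernels)
    # Stand-in for the BiLSTM sentence-level representation.
    sentence_vec = [0.1, -0.2]
    print(fuse(mean_pool(sentence), phrase, sentence_vec))
```

A filter of size k only ever sees k adjacent words, so using several sizes side by side gives the classifier phrase evidence at several granularities at once. Concatenation keeps each level's features intact, which is one simple way to form a joint feature vector.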


  • TP391
