Computer Science, 2020, Vol. 47, Issue (11A): 24-27. doi: 10.11896/jsjkx.200400116

• Artificial Intelligence •

CNN_BiLSTM_Attention Hybrid Model for Text Classification

WU Han-yu1,2, YAN Jiang2, HUANG Shao-bin1, LI Rong-sheng1, JIANG Meng-qi1   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  2. Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory, CETC Big Data Research Institute Co., Ltd., Guiyang 550000, China
  • Online: 2020-11-15  Published: 2020-11-17
  • Corresponding author: WU Han-yu (wuhanyu@hrbeu.edu.cn)
  • About author: WU Han-yu, born in 1996, M.S. His main research interests include natural language processing and deep learning.
  • Supported by:
    This work was supported by the Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory Open Fund Project.

Abstract: Text classification is the basis of many natural language processing tasks. A convolutional neural network (CNN) can extract phrase-level features from text but cannot capture its structural information well. A recurrent neural network (RNN) can extract the global structural information of text, but its ability to capture key pattern information is insufficient. An attention mechanism can learn how different words or phrases contribute to the overall semantics of a text, so key words or phrases are assigned higher weights, but it is likewise insensitive to global structural information. In addition, most existing models consider only word-level information and ignore phrase-level information. To address these problems, this paper proposes a hybrid model that integrates a CNN, an RNN, and attention. The model considers key pattern information and global structural information at different levels simultaneously, fuses them to obtain the final text representation, and feeds that representation into a softmax layer for classification. Experiments on multiple text classification datasets show that the model achieves higher accuracy than existing models. The effects of the model's individual components on its performance are also analyzed experimentally.
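The pipeline described in the abstract (parallel CNN and BiLSTM branches, attention pooling, feature fusion, softmax classification) can be made concrete with a short sketch. The PyTorch code below is a minimal illustration under assumed settings: three convolution widths with max-over-time pooling for key patterns, additive attention over BiLSTM states for global structure, and simple concatenation as the fusion step. Layer sizes, kernel widths, and the fusion strategy are assumptions for illustration, not the authors' reported configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNBiLSTMAttention(nn.Module):
    # Illustrative sketch of the hybrid model; hyperparameters are
    # assumptions, not the settings reported in the paper.
    def __init__(self, vocab_size, embed_dim=300, num_filters=100,
                 kernel_sizes=(3, 4, 5), hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # CNN branch: extracts local key-pattern (phrase-level) features.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
            for k in kernel_sizes)
        # BiLSTM branch: captures global structure information.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Additive attention over BiLSTM states: weights key words/phrases.
        self.attn = nn.Linear(2 * hidden_dim, 1)
        fused_dim = len(kernel_sizes) * num_filters + 2 * hidden_dim
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, x):                     # x: (batch, seq_len) token ids
        e = self.embedding(x)                 # (batch, seq_len, embed_dim)
        # CNN branch: convolution + max-over-time pooling per kernel width.
        c = e.transpose(1, 2)                 # (batch, embed_dim, seq_len)
        cnn_feats = torch.cat(
            [F.relu(conv(c)).max(dim=2).values for conv in self.convs],
            dim=1)                            # (batch, 3 * num_filters)
        # BiLSTM branch followed by attention pooling over all time steps.
        h, _ = self.bilstm(e)                 # (batch, seq_len, 2 * hidden_dim)
        alpha = torch.softmax(self.attn(h).squeeze(-1), dim=1)
        rnn_feats = (alpha.unsqueeze(-1) * h).sum(dim=1)
        # Fuse key-pattern and global-structure features, then classify.
        fused = torch.cat([cnn_feats, rnn_feats], dim=1)
        return self.classifier(fused)         # logits; softmax applied in loss

Under these assumptions, model = CNNBiLSTMAttention(vocab_size=20000, num_classes=5) followed by model(torch.randint(1, 20000, (8, 40))) returns an (8, 5) logit tensor. Training with cross-entropy applies the softmax implicitly, and the component ablation mentioned in the abstract corresponds to dropping cnn_feats or rnn_feats from the fused vector.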

Key words: Global structure information, Hybrid model, Key pattern information, Text classification, Text representation

CLC number: TP391.1

References:
[1] JOACHIMS T.Text categorization with support vector machines:Learning with many relevant features[C]//European Conference on Machine Learning.Berlin:Springer,1998:137-142.
[2] CHEN Z,SHI G,WANG X.Text classification based on Naive Bayes algorithm with feature selection[J].International Information Institute (Tokyo).Information,2012,15(10):4255.
[3] KIM Y.Convolutional neural networks for sentence classification[J].arXiv:1408.5882,2014.
[4] KALCHBRENNER N,GREFENSTETTE E,BLUNSOM P.A convolutional neural network for modelling sentences[J].arXiv:1404.2188,2014.
[5] ZHANG X,ZHAO J,LECUN Y.Character-level convolutional networks for text classification[C]//Advances in Neural Information Processing Systems.2015:649-657.
[6] CONNEAU A,SCHWENK H,BARRAULT L,et al.Very deep convolutional networks for text classification[J].arXiv:1606.01781,2016.
[7] TANG D,QIN B,LIU T.Document modeling with gated recurrent neural network for sentiment classification[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.2015:1422-1432.
[8] WANG B.Disconnected recurrent neural networks for text categorization[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).2018:2311-2320.
[9] YANG Z,YANG D,DYER C,et al.Hierarchical attention networks for document classification[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2016:1480-1489.
[10] LIN Z,FENG M,SANTOS C N,et al.A structured self-attentive sentence embedding[J].arXiv:1703.03130,2017.
[11] LIU Y,ZHAI D H,REN Q N.News Text Classification Based on CNLSTM Model with Attention Mechanism[J].Computer Engineering,2019,45(7):303-308,314.
[12] GU J H,PENG W T,LI N N,et al.Sentiment classification method based on convolution attention mechanism[J].Computer Engineering and Design,2020,41(1):95-99.
[13] PENNINGTON J,SOCHER R,MANNING C.Glove:Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).2014:1532-1543.
[14] PANG B,LEE L.Seeing stars:Exploiting class relationships for sentiment categorization with respect to rating scales[C]//Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics.Association for Computational Linguistics,2005:115-124.
[15] PANG B,LEE L.A sentimental education:Sentiment analysis using subjectivity summarization based on minimum cuts[C]//Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics.Association for Computational Linguistics,2004:271.
[16] LI X,ROTH D.Learning question classifiers[C]//Proceedings of the 19th International Conference on Computational Linguistics-Volume 1.Association for Computational Linguistics,2002:1-7.
[17] HU M,LIU B.Mining and summarizing customer reviews[C]//Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2004:168-177.
[18] SOCHER R,PERELYGIN A,WU J,et al.Recursive deep models for semantic compositionality over a sentiment treebank[C]//Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing.2013:1631-1642.
[19] KINGMA D P,BA J.Adam:A method for stochastic optimization[J].arXiv:1412.6980,2014.
[20] YIN W,SCHÜTZE H.Multichannel variable-size convolution for sentence classification[J].arXiv:1603.04513,2016.
[21] CHO K,VAN MERRIËNBOER B,GULCEHRE C,et al.Learning phrase representations using RNN encoder-decoder for statistical machine translation[J].arXiv:1406.1078,2014.
[22] ZHOU C,SUN C,LIU Z,et al.A C-LSTM Neural Network for Text Classification [J].Computer Science,2015,1(4):39-44.
[23] WANG C,JIANG F,YANG H.A hybrid framework for text modeling with convolutional RNN[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2017:2061-2069.
[24] ZHOU P,QI Z,ZHENG S,et al.Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling[J].arXiv:1611.06639,2016.
[25] ZHOU F,LI R Y.Convolutional Neural Network Model for Text Classification Based on BGRU Pooling[J].Computer Science,2018,45(6):235-240.
[26] ZHENG C,XUE M Y,HONG T T,et al.DC-BiGRU_CNN Model for Short Text Classification[J].Computer Science,2019,46(11):186-192.