Computer Science, 2020, Vol. 47, Issue (11A): 24-27. doi: 10.11896/jsjkx.200400116

• Artificial Intelligence •

CNN_BiLSTM_Attention Hybrid Model for Text Classification

WU Han-yu1,2, YAN Jiang2, HUANG Shao-bin1, LI Rong-sheng1, JIANG Meng-qi1   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
  2. Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory, CETC Big Data Research Institute Co., Ltd., Guiyang 550000, China
  • Online: 2020-11-15  Published: 2020-11-17
  • Corresponding author: WU Han-yu (wuhanyu@hrbeu.edu.cn)
  • About author: WU Han-yu, born in 1996, M.S. His main research interests include natural language processing and deep learning.
  • Supported by:
    This work was supported by the Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory Open Fund Project.

Abstract: Text classification is the basis of many natural language processing tasks. A convolutional neural network (CNN) can extract phrase-level features from text but cannot capture its structural information well. A recurrent neural network (RNN) can extract the global structural information of text, but its ability to capture key pattern information is insufficient. An attention mechanism can learn how different words or phrases contribute to the overall semantics of a text, so key words or phrases are assigned higher weights, but it is likewise insensitive to global structural information. In addition, most existing models consider only word-level information and ignore phrase-level information. To address these problems, this paper proposes a hybrid model that integrates a CNN, an RNN, and attention. The model considers key pattern information and global structural information at different levels simultaneously, fuses them to obtain the final text representation, and feeds that representation into a softmax layer for classification. Experiments on multiple text classification datasets show that the model achieves higher accuracy than existing models. The effects of the model's individual components on its performance are also analyzed experimentally.
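The pipeline described in the abstract (parallel CNN and BiLSTM branches, attention pooling, feature fusion, softmax classification) can be made concrete with a short sketch. The PyTorch code below is a minimal illustration under assumed settings: three convolution widths with max-over-time pooling for key patterns, additive attention over BiLSTM states for global structure, and simple concatenation as the fusion step. Layer sizes, kernel widths, and the fusion strategy are assumptions for illustration, not the authors' reported configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNBiLSTMAttention(nn.Module):
    # Illustrative sketch of the hybrid model; hyperparameters are
    # assumptions, not the settings reported in the paper.
    def __init__(self, vocab_size, embed_dim=300, num_filters=100,
                 kernel_sizes=(3, 4, 5), hidden_dim=128, num_classes=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # CNN branch: extracts local key-pattern (phrase-level) features.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k, padding=k // 2)
            for k in kernel_sizes)
        # BiLSTM branch: captures global structure information.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Additive attention over BiLSTM states: weights key words/phrases.
        self.attn = nn.Linear(2 * hidden_dim, 1)
        fused_dim = len(kernel_sizes) * num_filters + 2 * hidden_dim
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, x):                     # x: (batch, seq_len) token ids
        e = self.embedding(x)                 # (batch, seq_len, embed_dim)
        # CNN branch: convolution + max-over-time pooling per kernel width.
        c = e.transpose(1, 2)                 # (batch, embed_dim, seq_len)
        cnn_feats = torch.cat(
            [F.relu(conv(c)).max(dim=2).values for conv in self.convs],
            dim=1)                            # (batch, 3 * num_filters)
        # BiLSTM branch followed by attention pooling over all time steps.
        h, _ = self.bilstm(e)                 # (batch, seq_len, 2 * hidden_dim)
        alpha = torch.softmax(self.attn(h).squeeze(-1), dim=1)
        rnn_feats = (alpha.unsqueeze(-1) * h).sum(dim=1)
        # Fuse key-pattern and global-structure features, then classify.
        fused = torch.cat([cnn_feats, rnn_feats], dim=1)
        return self.classifier(fused)         # logits; softmax applied in loss

Under these assumptions, model = CNNBiLSTMAttention(vocab_size=20000, num_classes=5) followed by model(torch.randint(1, 20000, (8, 40))) returns an (8, 5) logit tensor. Training with cross-entropy applies the softmax implicitly, and the component ablation mentioned in the abstract corresponds to dropping cnn_feats or rnn_feats from the fused vector.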

Key words: Global structure information, Hybrid model, Key pattern information, Text classification, Text representation

CLC number: TP391.1

References:
[1] JOACHIMS T.Text categorization with support vector machines:Learning with many relevant features[C]//European Conference on Machine Learning.Berlin:Springer,1998:137-142.
[2] CHEN Z,SHI G,WANG X.Text classification based on Naive Bayes algorithm with feature selection[J].International Information Institute (Tokyo).Information,2012,15(10):4255.
[3] KIM Y.Convolutional neural networks for sentence classification[J].arXiv:1408.5882,2014.
[4] KALCHBRENNER N,GREFENSTETTE E,BLUNSOM P.A convolutional neural network for modelling sentences[J].arXiv:1404.2188,2014.
[5] ZHANG X,ZHAO J,LECUN Y.Character-level convolutional networks for text classification[C]//Advances in Neural Information Processing Systems.2015:649-657.
[6] CONNEAU A,SCHWENK H,BARRAULT L,et al.Very deep convolutional networks for text classification[J].arXiv:1606.01781,2016.
[7] TANG D,QIN B,LIU T.Document modeling with gated recurrent neural network for sentiment classification[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing.2015:1422-1432.
[8] WANG B.Disconnected recurrent neural networks for text categorization[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).2018:2311-2320.
[9] YANG Z,YANG D,DYER C,et al.Hierarchical attention networks for document classification[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.2016:1480-1489.
[10] LIN Z,FENG M,SANTOS C N,et al.A structured self-attentive sentence embedding[J].arXiv:1703.03130,2017.
[11] LIU Y,ZHAI D H,REN Q N.News Text Classification Based on CNLSTM Model with Attention Mechanism[J].Computer Engineering,2019,45(7):303-308,314.
[12] GU J H,PENG W T,LI N N,et al.Sentiment classification method based on convolution attention mechanism[J].Computer Engineering and Design,2020,41(1):95-99.
[13] PENNINGTON J,SOCHER R,MANNING C.Glove:Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).2014:1532-1543.
[14] PANG B,LEE L.Seeing stars:Exploiting class relationships for sentiment categorization with respect to rating scales[C]//Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics.Association for Computational Linguistics,2005:115-124.
[15] PANG B,LEE L.A sentimental education:Sentiment analysis using subjectivity summarization based on minimum cuts[C]//Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics.Association for Computational Linguistics,2004:271.
[16] LI X,ROTH D.Learning question classifiers[C]//Proceedings of the 19th International Conference on Computational Linguistics-Volume 1.Association for Computational Linguistics,2002:1-7.
[17] HU M,LIU B.Mining and summarizing customer reviews[C]//Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2004:168-177.
[18] SOCHER R,PERELYGIN A,WU J,et al.Recursive deep models for semantic compositionality over a sentiment treebank[C]//Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing.2013:1631-1642.
[19] KINGMA D P,BA J.Adam:A method for stochastic optimization[J].arXiv:1412.6980,2014.
[20] YIN W,SCHÜTZE H.Multichannel variable-size convolution for sentence classification[J].arXiv:1603.04513,2016.
[21] CHO K,VAN MERRIËNBOER B,GULCEHRE C,et al.Learning phrase representations using RNN encoder-decoder for statistical machine translation[J].arXiv:1406.1078,2014.
[22] ZHOU C,SUN C,LIU Z,et al.A C-LSTM Neural Network for Text Classification [J].Computer Science,2015,1(4):39-44.
[23] WANG C,JIANG F,YANG H.A hybrid framework for text modeling with convolutional RNN[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2017:2061-2069.
[24] ZHOU P,QI Z,ZHENG S,et al.Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling[J].arXiv:1611.06639,2016.
[25] ZHOU F,LI R Y.Convolutional Neural Network Model for Text Classification Based on BGRU Pooling[J].Computer Science,2018,45(6):235-240.
[26] ZHENG C,XUE M Y,HONG T T,et al.DC-BiGRU_CNN Model for Short Text Classification[J].Computer Science,2019,46(11):186-192.