Computer Science ›› 2019, Vol. 46 ›› Issue (11): 186-192. doi: 10.11896/jsjkx.180901702

• Artificial Intelligence •

DC-BiGRU_CNN Model for Short-text Classification

ZHENG Cheng, XUE Man-yi, HONG Tong-tong, SONG Fei-bao   

  1. (School of Computer Science and Technology, Anhui University, Hefei 230601, China)
     (Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Hefei 230601, China)
  • Received: 2018-09-11    Online: 2019-11-15    Published: 2019-11-14

Abstract: Text classification is a fundamental task in natural language processing, and deep learning techniques are increasingly used to tackle it. When processing text sequences, convolutional neural networks extract local features and recurrent neural networks extract global features, and both perform well. However, convolutional neural networks cannot capture the context-dependent semantic information of text very well, and recurrent networks are not sensitive to key semantic information. In addition, although deeper networks can extract richer features, they are prone to vanishing or exploding gradients. To address these problems, this paper proposes a hybrid model combining a densely connected bidirectional gated recurrent unit with a convolutional network (DC-BiGRU_CNN). First, a standard convolutional neural network is used to train character-level word vectors, which are concatenated with word-level word vectors to form the network input layer. Inspired by densely connected convolutional networks, a densely connected bidirectional gated recurrent unit is proposed for the high-level semantic modeling of text; it alleviates vanishing and exploding gradients and strengthens feature propagation between layers, thereby achieving feature reuse. Next, convolution and pooling operations are applied to the resulting deep semantic representation to obtain the final semantic feature representation, which is fed into a softmax layer to complete the text classification task. Experimental results on several public datasets show that DC-BiGRU_CNN significantly improves classification accuracy. This paper also analyzes how each component of the model contributes to the performance improvement, and studies the effect of parameters such as the maximum sentence length, the number of network layers, and the convolution kernel size on the model.
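To make the pipeline in the abstract concrete, the following is a minimal PyTorch sketch of the three stages it describes: a small CNN producing character-level word vectors that are concatenated with word-level vectors, a densely connected BiGRU stack in which each layer receives the concatenation of the input and all preceding layers' outputs, and a convolution-plus-max-pooling classifier head. This is an illustrative assumption, not the authors' code: the framework choice, class name DCBiGRUCNN, and all dimensions, kernel sizes, and layer counts are placeholders rather than the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCBiGRUCNN(nn.Module):
    """Hypothetical sketch of DC-BiGRU_CNN; all hyperparameters are illustrative."""

    def __init__(self, char_vocab=100, word_vocab=20000, char_dim=16,
                 word_dim=100, gru_hidden=64, gru_layers=3,
                 n_filters=100, kernel_size=3, n_classes=2):
        super().__init__()
        # Character-level word vectors: a small CNN over the characters of each word.
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        self.char_conv = nn.Conv1d(char_dim, char_dim, kernel_size=3, padding=1)
        self.word_emb = nn.Embedding(word_vocab, word_dim, padding_idx=0)

        # Densely connected BiGRU stack: layer i consumes the concatenation of the
        # input embeddings and the outputs of layers 1..i-1 (feature reuse).
        in_dim = word_dim + char_dim
        self.grus = nn.ModuleList()
        for _ in range(gru_layers):
            self.grus.append(nn.GRU(in_dim, gru_hidden,
                                    batch_first=True, bidirectional=True))
            in_dim += 2 * gru_hidden  # each dense connection widens the next input

        # Convolution + max-over-time pooling over the deep semantic features.
        self.conv = nn.Conv1d(in_dim, n_filters, kernel_size, padding=kernel_size // 2)
        self.fc = nn.Linear(n_filters, n_classes)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, word_len)
        b, t, w = chars.shape
        c = self.char_emb(chars.view(b * t, w)).transpose(1, 2)    # (b*t, char_dim, w)
        c = F.relu(self.char_conv(c)).max(dim=2).values.view(b, t, -1)
        x = torch.cat([self.word_emb(words), c], dim=-1)           # input layer

        feats = [x]
        for gru in self.grus:
            out, _ = gru(torch.cat(feats, dim=-1))                 # dense input
            feats.append(out)

        h = torch.cat(feats, dim=-1).transpose(1, 2)               # (b, dim, t)
        h = F.relu(self.conv(h)).max(dim=2).values                 # max pooling
        return self.fc(h)  # logits; softmax is applied inside the cross-entropy loss

if __name__ == "__main__":
    m = DCBiGRUCNN()
    words = torch.randint(1, 20000, (2, 30))       # toy batch of 2 sentences
    chars = torch.randint(1, 100, (2, 30, 8))      # 8 characters per word
    print(m(words, chars).shape)                   # torch.Size([2, 2])
```

The dense connections mirror DenseNet: because every BiGRU layer sees the original embeddings plus all earlier outputs, gradients have short paths to every layer, which is what the abstract credits for mitigating vanishing/exploding gradients and enabling feature reuse.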

Key words: Bi-directional gated recurrent unit, Character-level word vector, Convolutional neural network, Dense connection, Text classification

CLC Number: TP391.1