Computer Science ›› 2023, Vol. 50 ›› Issue (1): 221-228. doi: 10.11896/jsjkx.211100095

• Artificial Intelligence •

Text Classification Method Based on Bidirectional Attention and Gated Graph Convolutional Networks

ZHENG Cheng1,2, MEI Liang1,2, ZHAO Yiyan1, ZHANG Suhang1   

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, China
  2. Key Laboratory of Intelligent Computing and Signal Processing, Ministry of Education, Hefei 230601, China
  • Received: 2021-11-08  Revised: 2022-04-07  Online: 2023-01-15  Published: 2023-01-09
  • About author: ZHENG Cheng, born in 1964, Ph.D, associate professor. His main research interests include data mining, text analysis, and natural language processing.
  • Supported by:
    Key Research and Development Projects in Anhui Province (202004d07020009).

Abstract: Existing text classification models based on graph convolutional networks usually fuse the neighborhood information of different orders through the adjacency matrix alone when updating node representations in the graph, which leaves the word-sense information of the nodes insufficiently represented. In addition, models based on the conventional attention mechanism produce only positively weighted word embeddings, ignoring the impact of words that have a negative effect on the final classification. To overcome these problems, this paper proposes a model based on a bidirectional attention mechanism and gated graph convolutional networks. First, the model uses gated graph convolutional networks to selectively fuse the multi-order neighborhood information of the nodes in the graph, retaining the information of previous orders to enrich the node feature representations. Second, the model learns the influence of different words on the classification result through a bidirectional attention mechanism: words that contribute positively to the classification receive positive weights, while words that contribute negatively receive negative weights that weaken their influence in the vector representation, improving the model's ability to distinguish nodes with different properties in a document. Finally, max pooling and average pooling are combined to fuse the word representations into the document representation used for the final classification; average pooling lets every word contribute to the graph-level representation of the document, while max pooling lets the important words play a greater role in the document embedding. Extensive experiments on four benchmark datasets show that the proposed model significantly outperforms the baseline models.
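
The abstract describes three components: gated fusion of multi-order neighborhood information, a bidirectional (signed) attention over words, and a max-plus-average pooling readout. The paper's own code is not reproduced here; the PyTorch sketch below is only a minimal illustration of how such components could fit together. All names (GatedGCNLayer, bi_attention, readout) and design details (a sigmoid update gate, tanh attention scores) are assumptions made for illustration, not the authors' implementation.

    # Minimal sketch of the three components described in the abstract.
    # All module and function names are hypothetical, not from the paper.
    import torch
    import torch.nn as nn

    class GatedGCNLayer(nn.Module):
        """One graph-convolution step whose output is gated against the
        previous-order representation, so earlier-order information is
        selectively retained (assumption: a sigmoid update gate)."""
        def __init__(self, dim):
            super().__init__()
            self.linear = nn.Linear(dim, dim)
            self.gate = nn.Linear(2 * dim, dim)

        def forward(self, adj, h):
            # adj: (n, n) normalized adjacency; h: (n, dim) node features
            neigh = torch.tanh(self.linear(adj @ h))          # aggregate neighbors
            z = torch.sigmoid(self.gate(torch.cat([h, neigh], dim=-1)))
            return z * neigh + (1.0 - z) * h                  # retain previous order

    def bi_attention(h, query):
        # tanh keeps scores in [-1, 1], so words can receive *negative*
        # weights that actively suppress their contribution.
        scores = torch.tanh(h @ query)                        # (n,)
        return scores.unsqueeze(-1) * h                       # signed reweighting

    def readout(h):
        # Average pooling lets every word contribute; max pooling lets the
        # most salient words dominate. Concatenate both views.
        return torch.cat([h.mean(dim=0), h.max(dim=0).values], dim=-1)

    # Toy usage: 5 word nodes with 16-dimensional embeddings.
    n, dim = 5, 16
    adj = torch.softmax(torch.rand(n, n), dim=-1)             # stand-in adjacency
    h = torch.randn(n, dim)
    query = torch.randn(dim)                                  # learnable in practice
    layer = GatedGCNLayer(dim)
    doc = readout(bi_attention(layer(adj, h), query))
    print(doc.shape)                                          # torch.Size([32])

In the paper itself the gated layers would be stacked and the attention query learned end to end with the classifier; the sketch fixes both only to keep the example self-contained.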

Key words: Text classification, Graph convolutional networks, Attention mechanism, Text representation, Deep learning, Natural language processing

CLC Number: TP391