Computer Science (计算机科学), 2022, Vol. 49, Issue (6): 326-334. doi: 10.11896/jsjkx.210400218

• Artificial Intelligence •


Text Classification Based on Attention Gated Graph Neural Network

DENG Zhao-yang1, ZHONG Guo-qiang1, WANG Dong2   

  1 School of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China
    2 Library of Ocean University of China, Qingdao, Shandong 266100, China
  • Received: 2021-04-20  Revised: 2021-06-12  Online: 2022-06-15  Published: 2022-06-08
  • Corresponding author: WANG Dong (wangdong@ouc.edu.cn)
  • About author: DENG Zhao-yang, born in 1995, postgraduate (zydeng@stu.ouc.edu.cn). His main research interests include deep learning and graph neural networks.
    WANG Dong, born in 1979, Ph.D, senior engineer. His main research interests include machine vision, embedded systems, software programming and IoT design.
  • Supported by:
    Major Project for New Generation of AI (2018AAA0100400), Joint Fund of the Equipments Pre-Research and Ministry of Education of China (6141A020337), Natural Science Foundation of Shandong Province (ZR2020MF131) and Science and Technology Program of Qingdao (21-1-4-ny-19-nsh).



Abstract: To address the problem that existing text classification methods usually ignore the semantic interaction between words when generating text representations, this paper proposes a novel text classification model based on an attention gated graph neural network. It makes effective use of the semantic features of words and improves the accuracy of text classification through adequate semantic interaction. Firstly, each input text is converted into an individual graph-structured sample and the semantic features of its word nodes are extracted. Secondly, the attention gated graph neural network is used to interact and update the semantic features of the word nodes. Then, an attention-based text pooling module extracts the word nodes with discriminative semantic features to construct the text graph representation. Finally, effective text classification is performed based on the text graph representation. Experimental results show that the proposed method achieves accuracies of 70.83%, 98.18%, 94.72% and 80.03% on the Ohsumed, R8, R52 and MR datasets, respectively, outperforming existing methods.

Key words: Attention mechanism, Deep learning, Graph neural network, Text classification
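The abstract describes a four-step pipeline: build an individual graph for each text, let word nodes exchange semantic features through an attention gated graph neural network, pool the discriminative nodes into a graph-level representation, and classify. The paper's equations are not reproduced on this page, so the following PyTorch sketch is only one plausible reading of that pipeline; the sliding-window graph construction, the class and layer names, the gating and attention formulas, and all sizes are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn


def window_adjacency(num_words, window=3):
    # Hypothetical graph construction: connect words that co-occur within a
    # small sliding window (plus self-loops). The paper's actual construction
    # is described in the full text, not on this abstract page.
    adj = torch.eye(num_words)
    for i in range(num_words):
        lo, hi = max(0, i - window + 1), min(num_words, i + window)
        adj[i, lo:hi] = 1.0
    return adj


class AttentionGatedGNNLayer(nn.Module):
    # One round of semantic interaction between word nodes: neighbor messages
    # are weighted by learned attention scores and the node state is updated
    # through a GRU-style gate, as in gated graph neural networks.
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(dim, dim)
        self.att = nn.Linear(2 * dim, 1)
        self.gru = nn.GRUCell(dim, dim)

    def forward(self, h, adj):
        n = h.size(0)
        m = self.msg(h)                                   # candidate messages, (n, dim)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          m.unsqueeze(0).expand(n, n, -1)], dim=-1)
        scores = self.att(pair).squeeze(-1)               # (n, n) attention logits
        scores = scores.masked_fill(adj == 0, float("-inf"))
        alpha = torch.softmax(scores, dim=-1)
        alpha = torch.nan_to_num(alpha)                   # isolated rows become all-zero
        agg = alpha @ m                                   # attention-weighted aggregation
        return self.gru(agg, h)                           # gated update of node features


class AttentionTextPooling(nn.Module):
    # Attention-based readout: emphasizes word nodes with discriminative
    # semantic features when building the graph-level text representation.
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h):
        w = torch.softmax(self.score(h), dim=0)           # (n, 1) node importance
        return (w * h).sum(dim=0)                         # (dim,) text representation


class TextGraphClassifier(nn.Module):
    def __init__(self, vocab_size, dim, num_classes, steps=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)        # could be initialized from pretrained vectors
        self.layers = nn.ModuleList([AttentionGatedGNNLayer(dim) for _ in range(steps)])
        self.pool = AttentionTextPooling(dim)
        self.cls = nn.Linear(dim, num_classes)

    def forward(self, word_ids, adj):
        h = self.embed(word_ids)                          # semantic features of word nodes
        for layer in self.layers:
            h = layer(h, adj)                             # interaction and update
        return self.cls(self.pool(h))                     # class logits for one document


# Toy usage (vocabulary size, dimensions and class count are illustrative only):
model = TextGraphClassifier(vocab_size=30000, dim=300, num_classes=8)
ids = torch.tensor([3, 17, 42, 5])                        # token ids of a 4-word document
logits = model(ids, window_adjacency(len(ids)))           # -> tensor of shape (8,)
```

A model of this kind would typically be trained end to end with a cross-entropy loss over the document labels; the sketch above is meant only to make the abstract's pipeline concrete, not to reproduce the published architecture or hyperparameters.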

CLC number: TP183