Computer Science, 2018, Vol. 45, Issue (6): 235-240. doi: 10.11896/j.issn.1002-137X.2018.06.042

• Artificial Intelligence •

Convolutional Neural Network Model for Text Classification Based on BGRU Pooling

ZHOU Feng, LI Rong-yu

  1. School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
  • Received: 2017-05-03 Online: 2018-06-15 Published: 2018-07-24
  • About the authors: ZHOU Feng (born 1992), male, master's student; his main research interests are machine learning and deep learning. LI Rong-yu (born 1977), male, Ph.D., associate professor; his main research interest is machine learning for the process industry. E-mail: alwayslry@sina.com (corresponding author)
  • Funding:
    This work was supported by the Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (12KJB510007)

Abstract: To address the poor adaptability and low accuracy of deep learning models in text classification, this paper proposes an improved convolutional neural network model that uses a bi-directional gated recurrent unit (BGRU) for pooling. In the pooling stage, the intermediate sentence representation produced by the BGRU is compared with the local representations obtained from the convolutional layer; local representations with high similarity are judged to be important information, and this information is retained by increasing their weights. The model can be trained end to end on multiple types of text and therefore adapts well. Experimental results show that, compared with other models of the same kind, the proposed model has a considerable advantage in learning ability and significantly improves classification accuracy.

Key words: Bi-directional gated recurrent unit, Convolutional neural network, Deep learning, Text classification
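
The pooling step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity measure or weighting scheme the model uses, so the cosine similarity and softmax normalization below are assumptions, and the names `cosine` and `similarity_pool` are hypothetical. The sentence representation, which the BGRU would produce in the model, is passed in as a plain vector here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_pool(local_reps, sentence_rep):
    """Pool local feature vectors into one vector.

    Each local representation (one per convolution window) is weighted by
    its similarity to the sentence representation, so that more similar
    windows contribute more to the pooled result.
    """
    # Assumption: cosine similarity is the comparison measure.
    sims = [cosine(c, sentence_rep) for c in local_reps]
    # Assumption: a softmax turns similarities into weights that sum to 1.
    peak = max(sims)
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the local representations, dimension by dimension.
    dim = len(sentence_rep)
    pooled = [sum(w * c[j] for w, c in zip(weights, local_reps))
              for j in range(dim)]
    return weights, pooled


# Toy input: three local representations and one sentence representation.
local_reps = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
sentence_rep = [1.0, 1.0]
weights, pooled = similarity_pool(local_reps, sentence_rep)
```

In this toy input the third local representation points in the same direction as the sentence representation, so it receives the largest weight, while the first two receive equal, smaller weights.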

CLC Number:

  • TP183