Computer Science ›› 2022, Vol. 49 ›› Issue (3): 281-287. doi: 10.11896/jsjkx.210200090

Special Topic: Natural Language Processing (Virtual Topic)

• Artificial Intelligence •


FMNN: Text Classification Model Fused with Multiple Neural Networks

DENG Wei-bin, ZHU Kun, LI Yun-bo, HU Feng   

  1. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received: 2021-02-09 Revised: 2021-05-03 Online: 2022-03-15 Published: 2022-03-15
  • About author: DENG Wei-bin, born in 1978, Ph.D, professor (dengwb@cqupt.edu.cn). His main research interests include intelligent information processing, natural language processing and uncertainty decision-making.
    ZHU Kun (corresponding author), born in 1997, postgraduate (1209562838@qq.com). His main research interests include natural language processing and intelligent information processing.
  • Supported by:
    National Key Research and Development Program of China (2018YFC0832100, 2018YFC0832102) and Key Program of National Natural Science Foundation of China (61936001).

Abstract: Text classification is a basic and important task in natural language processing. Most deep-learning-based text classification methods study only a single model structure, which lacks the ability to capture and exploit global and local semantic features at the same time; moreover, deepening the network loses more semantic information. To overcome these problems, this paper proposes FMNN, a text classification model fused with multiple neural networks. The model combines the strengths of BERT, RNN, CNN and Attention while minimizing network depth. BERT serves as the embedding layer to obtain the matrix representation of the text; BiLSTM and Attention jointly extract the global semantic features of the text; CNN extracts local semantic features at multiple granularities. The global and local semantic features are fed to separate softmax classifiers, and the two results are fused by arithmetic averaging. Experimental results on three public datasets and one judicial dataset show that FMNN achieves higher classification accuracy, reaching 90.31% on the judicial dataset, which demonstrates the model's practical value.
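The architecture in the abstract — BERT embeddings feeding a BiLSTM-with-attention global branch and a multi-granularity CNN local branch, fused by averaging two softmax outputs — can be sketched compactly. Below is a minimal PyTorch sketch of that design, assuming Hugging Face's transformers library for BERT; the checkpoint name, hidden sizes and kernel sizes are illustrative assumptions, not the paper's reported settings.

    # Minimal sketch of the FMNN fusion architecture (assumed hyperparameters).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from transformers import BertModel

    class FMNN(nn.Module):
        def __init__(self, num_classes, bert_name="bert-base-chinese",
                     lstm_hidden=256, num_filters=128, kernel_sizes=(2, 3, 4)):
            super().__init__()
            # BERT as the embedding layer: one contextual vector per token.
            self.bert = BertModel.from_pretrained(bert_name)
            dim = self.bert.config.hidden_size  # 768 for bert-base
            # Global branch: BiLSTM + additive attention over time steps.
            self.bilstm = nn.LSTM(dim, lstm_hidden, batch_first=True,
                                  bidirectional=True)
            self.att = nn.Linear(2 * lstm_hidden, 1)
            self.fc_global = nn.Linear(2 * lstm_hidden, num_classes)
            # Local branch: parallel 1-D convolutions at several granularities.
            self.convs = nn.ModuleList(
                nn.Conv1d(dim, num_filters, k) for k in kernel_sizes)
            self.fc_local = nn.Linear(num_filters * len(kernel_sizes),
                                      num_classes)

        def forward(self, input_ids, attention_mask):
            h = self.bert(input_ids,
                          attention_mask=attention_mask).last_hidden_state
            # Global semantic features: attention-weighted sum of BiLSTM states.
            s, _ = self.bilstm(h)                               # (B, T, 2H)
            w = torch.softmax(self.att(s).squeeze(-1), dim=1)   # (B, T)
            g = torch.bmm(w.unsqueeze(1), s).squeeze(1)         # (B, 2H)
            # Local semantic features: max-pooled n-gram convolutions.
            c = h.transpose(1, 2)                               # (B, dim, T)
            loc = torch.cat([F.relu(conv(c)).max(dim=2).values
                             for conv in self.convs], dim=1)
            # Two softmax classifiers, fused by arithmetic average.
            p_global = F.softmax(self.fc_global(g), dim=-1)
            p_local = F.softmax(self.fc_local(loc), dim=-1)
            return (p_global + p_local) / 2

Keeping the two branches parallel on top of a single BERT encoder, rather than stacking them, is what bounds the depth: each branch reads the full BERT output directly, so neither the global nor the local features have to pass through the other branch's layers, and averaging the two probability distributions lets one branch compensate when the other is uncertain.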

Key words: Deep learning, Fusion, Global semantic features, Local semantic features, Semantic loss, Text classification

CLC Number: TP391