Computer Science (计算机科学), 2022, Vol. 49, Issue 3: 281-287. doi: 10.11896/jsjkx.210200090
Special Topic: Natural Language Processing (Virtual Special Issue)
DENG Wei-bin (邓维斌), ZHU Kun (朱坤), LI Yun-bo (李云波), HU Feng (胡峰)
Abstract: Text classification is a fundamental and important task in natural language processing. Most deep-learning-based text classification methods study a single model structure in depth; such a single structure lacks the ability to capture and exploit global and local semantic features at the same time, and deepening the network loses more semantic information. To address this, a text classification model fused with multiple neural networks, FMNN (A Text Classification Model Fused with Multiple Neural Networks), is proposed. While keeping the network as shallow as possible, FMNN fuses the characteristics of BERT, RNN, CNN and Attention. BERT serves as the embedding layer to obtain a matrix representation of the text; BiLSTM and Attention jointly extract the global semantic features; a CNN extracts local semantic features at multiple granularities; the global and local features are fed into separate softmax classifiers, and the results are finally fused by arithmetic averaging. Experimental results on three public datasets and one judicial dataset show that FMNN achieves higher text classification accuracy, reaching 90.31% on the judicial dataset, which demonstrates the practical value of the model.
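The fusion scheme described in the abstract can be illustrated with a short sketch. The following PyTorch code is not the authors' implementation; the class name FMNNSketch, the hidden sizes, the convolution kernel sizes, and the plain nn.Embedding stand-in for the pre-trained BERT encoder are all assumptions made so the example stays self-contained and runnable.

```python
# Minimal sketch of the FMNN-style fusion (assumptions noted above), not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FMNNSketch(nn.Module):
    def __init__(self, vocab_size=21128, embed_dim=768, hidden=256,
                 num_classes=10, kernel_sizes=(2, 3, 4), num_filters=128):
        super().__init__()
        # Stand-in for the BERT embedding layer; the paper uses a pre-trained
        # BERT encoder (e.g. transformers.BertModel) to produce the text matrix.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Global branch: BiLSTM plus attention over the time steps.
        self.bilstm = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.global_fc = nn.Linear(2 * hidden, num_classes)
        # Local branch: 1-D convolutions at several n-gram granularities.
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, k) for k in kernel_sizes)
        self.local_fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):
        x = self.embed(token_ids)                        # (B, T, E)
        # Global semantic features: BiLSTM states weighted by attention.
        h, _ = self.bilstm(x)                            # (B, T, 2H)
        w = torch.softmax(self.attn(h), dim=1)           # (B, T, 1)
        g = (w * h).sum(dim=1)                           # (B, 2H)
        p_global = F.softmax(self.global_fc(g), dim=-1)
        # Local semantic features: max-pooled multi-width convolutions.
        c = x.transpose(1, 2)                            # (B, E, T)
        pooled = [F.relu(conv(c)).max(dim=2).values for conv in self.convs]
        p_local = F.softmax(self.local_fc(torch.cat(pooled, dim=1)), dim=-1)
        # Fuse the two softmax distributions by arithmetic mean.
        return (p_global + p_local) / 2

if __name__ == "__main__":
    model = FMNNSketch()
    dummy = torch.randint(0, 21128, (4, 32))             # batch of 4 token sequences
    print(model(dummy).shape)                            # torch.Size([4, 10])
```

Note that, consistent with the abstract, the two branches are fused at the level of their softmax probability distributions rather than by concatenating features, so each branch stays shallow and contributes its classification result independently.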