Computer Science ›› 2022, Vol. 49 ›› Issue (3): 281-287. doi: 10.11896/jsjkx.210200090

Special Issue: Natural Language Processing

• Artificial Intelligence •

FMNN: Text Classification Model Fused with Multiple Neural Networks

DENG Wei-bin, ZHU Kun, LI Yun-bo, HU Feng   

  1. Chongqing Key Laboratory of Computational Intelligence,Chongqing University of Posts and Telecommunications,Chongqing 400065,China
  • Received:2021-02-09 Revised:2021-05-03 Online:2022-03-15 Published:2022-03-15
  • About author:DENG Wei-bin,born in 1978,Ph.D,professor.His main research interests include intelligent information processing,natural language processing and uncertainty decision-making.
    ZHU Kun,born in 1997,postgraduate.His main research interests include natural language processing and intelligent information processing.
  • Supported by:
    National Key Research and Development Program of China(2018YFC0832100,2018YFC0832102) and Key Program of National Natural Science Foundation of China(61936001).

Abstract: Text classification is a basic and important task in natural language processing. Most deep learning based text classification methods rely on a single model structure, which cannot capture and exploit global and local semantic features at the same time; moreover, deepening the network loses further semantic information. To overcome these problems, this paper proposes FMNN, a text classification model fused with multiple neural networks. The model combines the strengths of BERT, RNN, CNN and attention while keeping the network shallow. BERT serves as the embedding layer to obtain the matrix representation of the text; BiLSTM and attention jointly extract the global semantic features of the text; CNN extracts local semantic features at multiple granularities. The global and local semantic features are fed to separate softmax classifiers, and the two results are fused by arithmetic averaging. Experimental results on three public datasets and one judicial dataset show that FMNN achieves higher accuracy, reaching 90.31% on the judicial dataset, which demonstrates the model's practical value.
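To make the described pipeline concrete, the following PyTorch sketch illustrates the architecture as the abstract presents it: BERT embeddings, a BiLSTM-plus-attention global branch, a multi-granularity CNN local branch, and arithmetic-average fusion of the two softmax outputs. This is a minimal illustration under stated assumptions, not the authors' released implementation; the hidden size, kernel sizes, filter count, and the bert-base-chinese checkpoint are all assumed values.

```python
# Minimal sketch of an FMNN-style model; hyperparameters and the
# pretrained checkpoint are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn
from transformers import BertModel

class FMNN(nn.Module):
    def __init__(self, num_classes, hidden=128, kernel_sizes=(2, 3, 4), n_filters=64):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")  # embedding layer
        d = self.bert.config.hidden_size  # 768 for BERT-base
        # Global branch: BiLSTM with additive attention over time steps
        self.bilstm = nn.LSTM(d, hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)
        self.fc_global = nn.Linear(2 * hidden, num_classes)
        # Local branch: 1-D convolutions at multiple n-gram granularities
        self.convs = nn.ModuleList(
            [nn.Conv1d(d, n_filters, k) for k in kernel_sizes])
        self.fc_local = nn.Linear(n_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        # Matrix representation of the text from BERT: (B, T, d)
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        # Global semantic features: attention-weighted sum of BiLSTM states
        h, _ = self.bilstm(x)                              # (B, T, 2*hidden)
        w = torch.softmax(self.att(h).squeeze(-1), dim=1)  # (B, T)
        g = torch.bmm(w.unsqueeze(1), h).squeeze(1)        # (B, 2*hidden)
        p_global = torch.softmax(self.fc_global(g), dim=-1)
        # Local semantic features: max-pooled n-gram convolutions
        c = x.transpose(1, 2)                              # (B, d, T)
        feats = [torch.relu(conv(c)).max(dim=2).values for conv in self.convs]
        p_local = torch.softmax(self.fc_local(torch.cat(feats, dim=1)), dim=-1)
        # Fuse the two classifier outputs by arithmetic average
        return (p_global + p_local) / 2
```

One consequence of this fusion choice: because each branch emits a probability distribution rather than logits, training would use a negative log-likelihood loss on the averaged distribution instead of a logit-based cross-entropy; that detail is a design inference from the abstract, not something it states.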

Key words: Deep learning, Fusion, Global semantic features, Local semantic features, Semantic loss, Text classification

CLC Number: TP391