Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240600121-8.doi: 10.11896/jsjkx.240600121

• Artificial Intelligence •

MacBERT-based Chinese Named Entity Recognition Fusing Dependency Syntactic Information and Multi-view Lexical Information

LI Daicheng, LI Han, LIU Zheyu, GONG Shiheng   

  1. School of Electronic and Information Engineering, Liaoning University of Technology, Jinzhou, Liaoning 121000, China
  • Online: 2025-06-16 Published: 2025-06-12
  • About author: LI Daicheng, born in 1999, postgraduate. His main research interests include natural language processing and knowledge graphs.
    LI Han, born in 1984, Ph.D, associate professor. His main research interests include complex networks and embedded systems.
  • Supported by:
    2024 Fundamental Research Project of the Educational Department of Liaoning Province, Liaoning Province “Ranking” Project (2022JH1/10400009), Teaching Reform Research Project of Liaoning University of Technology (xjg2022033) and Liaoning Province Livelihood Science and Technology Plan (2021JH2/10200002).

Abstract: In Chinese settings with open entity types and complex entity structures, the Chinese named entity recognition (CNER) task suffers from notable problems such as entity boundary judgment errors and low entity classification accuracy. To address these issues, a Chinese named entity recognition model called MacBERT-SDI-ML is proposed, which is based on the MacBERT pre-trained model and uses characters as encoding units. First, to extract richer Chinese semantic features and improve recognition accuracy, the model adopts MacBERT (whole word masking for Chinese BERT) as the embedding layer. Second, to further enhance entity representations and improve the accuracy of entity classification, the model uses a syntactic dependency information parser (SDIP) to efficiently extract richer dependency information for entities and integrate it into the character representations. In addition, since a character's position may vary across different words, the model incorporates a multi-view lexical information fusion component (MLIF) based on the self-attention mechanism to further strengthen the boundary features of the character representations and improve the accuracy of boundary judgment. Finally, experiments on the Weibo, OntoNotes and Resume datasets show that the proposed model achieves F1 scores of 72.97%, 86.56% and 98.45%, respectively.
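The multi-view lexical fusion described above can be illustrated with a minimal sketch: a character representation attends over several lexical-view vectors (e.g. embeddings of the word sets in which the character appears at the beginning, middle, end, or as a single word) via scaled dot-product attention, and the attended context is concatenated back onto the character vector. This is an assumption-laden toy in NumPy, not the paper's actual MLIF implementation; the function name `fuse_lexical_views`, the view count, and the dimensions are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_lexical_views(char_vec, view_vecs):
    """Attend from one character representation over its lexical-view
    vectors (hypothetical stand-ins for B/M/E/S word-set embeddings)
    and return the character vector enriched with the attended context."""
    d = char_vec.shape[-1]
    scores = view_vecs @ char_vec / np.sqrt(d)   # (n_views,) attention logits
    weights = softmax(scores)                    # attention over the views
    context = weights @ view_vecs                # weighted sum of view vectors
    return np.concatenate([char_vec, context])   # boundary-enriched representation

rng = np.random.default_rng(0)
char = rng.normal(size=8)         # one character embedding (dim 8)
views = rng.normal(size=(4, 8))   # four lexical views for that character
fused = fuse_lexical_views(char, views)
print(fused.shape)  # (16,)
```

In the actual model, such a fused representation would then be combined with the SDIP dependency features before decoding; here the sketch only shows why self-attention lets word-position (boundary) evidence from multiple views flow into a single character vector.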

Key words: CNER, MacBERT, Lexical information, Dependency information, Pre-training model, Self-attention mechanism

CLC Number: TP391