Computer Science ›› 2025, Vol. 52 ›› Issue (4): 262-270. doi: 10.11896/jsjkx.240100119

• Artificial Intelligence •

CGR-BERT-ZESHEL: Zero-shot Entity Linking Model with Chinese Features

PAN Jian1,2, WU Zhiwei1, LI Yanjun1   

  1 College of Computer Science and Technology,Zhejiang University of Technology,Hangzhou 310023,China
  2 Zhijiang College of Zhejiang University of Technology,Shaoxing,Zhejiang 312030,China
  • Received:2024-01-15 Revised:2024-05-15 Online:2025-04-15 Published:2025-04-14
  • About author:PAN Jian,born in 1976,Ph.D,associate professor,postgraduate supervisor,is a member of CCF(No.26947M).His main research interests include natural language processing,intelligent information processing and Internet of Things.
  • Supported by:
    Natural Science Foundation of Zhejiang Province,China(LGF20F020015).

Abstract: Existing research on entity linking pays relatively little attention to Chinese entity linking and to the linking of emerging or previously unseen entities. In addition, traditional BERT models ignore two crucial aspects of Chinese, namely glyphs and radicals, which carry important syntactic and semantic information for language understanding. To address these problems, this paper proposes a zero-shot entity linking model based on Chinese features, called CGR-BERT-ZESHEL. First, the model incorporates glyph and radical features by introducing visual image embeddings and traditional character embeddings, respectively, to enrich word vector representations and mitigate the effect of out-of-vocabulary words. Then, a two-stage pipeline of candidate entity generation followed by candidate entity ranking produces the final linking results. Experimental results on two datasets, Hansel and CLEEK, show that compared with the baseline model, Recall@100 improves by 17.49% and 7.34% in the candidate entity generation stage, and accuracy improves by 3.02% and 3.11% in the candidate entity ranking stage. The proposed model also outperforms the other baseline models on both the Recall@100 and Accuracy metrics.
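For readers who want a concrete picture of the two components summarized above, the following is a minimal PyTorch sketch. All names, dimensions, and the small CNN layout are illustrative assumptions rather than the authors' released implementation; the sketch only shows how glyph-image and radical embeddings could be fused with BERT character embeddings, and how a dot-product retrieval stage followed by a ranking stage could be organized.

import torch
import torch.nn as nn

class GlyphRadicalEmbedding(nn.Module):
    """Fuses BERT character embeddings with glyph-image and radical features (hypothetical sketch)."""
    def __init__(self, hidden_size=768, glyph_dim=256, num_radicals=300, radical_dim=64):
        super().__init__()
        # Glyph branch: a small CNN over rendered 24x24 grayscale character images.
        self.glyph_cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, glyph_dim),
        )
        # Radical branch: an embedding table indexed by each character's radical.
        self.radical_emb = nn.Embedding(num_radicals, radical_dim)
        # Project the concatenated features back to the encoder width.
        self.fuse = nn.Linear(hidden_size + glyph_dim + radical_dim, hidden_size)

    def forward(self, bert_emb, glyph_imgs, radical_ids):
        # bert_emb: (batch, seq, hidden); glyph_imgs: (batch, seq, 1, 24, 24); radical_ids: (batch, seq)
        b, s = radical_ids.shape
        glyph = self.glyph_cnn(glyph_imgs.view(b * s, 1, 24, 24)).view(b, s, -1)
        radical = self.radical_emb(radical_ids)
        return self.fuse(torch.cat([bert_emb, glyph, radical], dim=-1))

def generate_candidates(mention_vec, entity_vecs, k=100):
    """Stage 1 (candidate generation): score all entity vectors by dot product and keep the top-k."""
    scores = entity_vecs @ mention_vec          # (num_entities,)
    return torch.topk(scores, min(k, scores.numel())).indices

def rank_candidates(cross_scores):
    """Stage 2 (candidate ranking): pick the candidate with the highest cross-encoder score."""
    return int(torch.argmax(cross_scores))

In this layout, Stage 1 corresponds to the Recall@100 numbers reported in the abstract (how often the gold entity survives retrieval), while Stage 2 corresponds to the accuracy of the final choice among the retrieved candidates.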

Key words: Entity linking, Chinese zero-shot, BERT, Candidate entity generation, Candidate entity ranking

CLC Number: TP391
[1]BUNESCU R,PASCA M.Using encyclopedic knowledge for named entity disambiguation[C]//11th Conference of the European Chapter of the Association for Computational Linguistics.2006:9-16.
[2]DE CAO N,AZIZ W,TITOV I.Question answering by reasoning across documents with graph convolutional networks[C]//2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics.Association for Computational Linguistics,2019:2306-2317.
[3]SHEN W,WANG J Y,HAN J W.Entity linking with a knowledge base:issues,techniques,and solutions[J].IEEE Transactions on Knowledge and Data Engineering,2014,27(2):443-460.
[4]CURRY A C,PAPAIOANNOU I,SUGLIA A,et al.Alana v2:Entertaining and informative open-domain social dialogue using ontologies and entity linking [C]//Alexa Prize Proceedings.2018.
[5]LOGESWARAN L,CHANG M W,LEE K,et al.Zero-shot entity linking by reading entity descriptions[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.2019:3449-3460.
[6]DEVLIN J,CHANG M W,LEE K,et al.Bert:pre-training of deep bidirectional transformers for language understanding[C]//Proceedings of NAACL-HLT.2019.
[7]VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[J].arXiv:1706.03762,2017.
[8]YANG Z L,DAI Z H,YANG Y M,et al.Xlnet:generalized autoregressive pretraining for language understanding [C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems.2019:5753-5763.
[9]CLARK K,LUONG M T,LE Q V,et al.ELECTRA:pre-training text encoders as discriminators rather than generators[C]//International Conference on Learning Representations.2019.
[10]LAN Z Z,CHEN M D,GOODMAN S,et al.ALBERT:a lite BERT for self-supervised learning of language representations[C]//International Conference on Learning Representations.2019.
[11]LIU Y H,OTT M,GOYAL N,et al.Roberta:a robustly optimized bert pretraining approach [J].arXiv:1907.11692,2019.
[12]RADFORD A,WU J,CHILD R,et al.Language models are unsupervised multitask learners[J].OpenAI blog,2019,1(8):9.
[13]BROWN T,MANN B,RYDER N,et al.Language models are few-shot learners[J].Advances in Neural Information Processing Systems,2020,33:1877-1901.
[14]RAE J W,BORGEAUD S,CAI T,et al.Scaling language models:methods,analysis & insights from training gopher[J].arXiv:2112.11446,2021.
[15]WANG B.Mesh-Transformer-JAX:model-parallel implementation of transformer language model with JAX[EB/OL].https://github.com/kingoflolz/mesh-transformer-jax,2021.
[16]LEWIS M,LIU Y H,GOYAL N,et al.BART:denoising sequence-to-sequence pre-training for natural language generation,translation,and comprehension[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.2020:7871-7880.
[17]BAO H B,DONG L,WEI F R,et al.Unilmv2:pseudo-masked language models for unified language model pre-training[C]//International Conference on Machine Learning.PMLR,2020:642-652.
[18]ZHU J H,XIA Y C,WU L J,et al.Incorporating BERT into neural machine translation[C]//International Conference on Learning Representations.2019.
[19]LI X Y,MENG Y X,SUN X F,et al.Is word segmentation necessary for deep learning of chinese representations?[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.2019:3242-3252.
[20]SUN Y,WANG S H,LI Y K,et al.Ernie:enhanced representation through knowledge integration [J].arXiv:1904.09223,2019.
[21]CUI Y M,CHE W X,LIU T,et al.Pre-training with whole word masking for chinese bert[J].IEEE/ACM Transactions on Audio,Speech,and Language Processing,2021,29:3504-3514.
[22]SUN Z J,LI X Y,SUN X F,et al.ChineseBERT:chinese pretraining enhanced by glyph and pinyin information[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing(Volume 1:Long Papers).2021:2065-2075.
[23]OUYANG L,WU J,JIANG X,et al.Training language models to follow instructions with human feedback [J].Advances in Neural Information Processing Systems,2022,35:27730-27744.
[24]ABDULLAH M,MADAIN A,JARARWEH Y.ChatGPT:fundamentals,applications and social impacts[C]//2022 Ninth International Conference on Social Networks Analysis,Management and Security(SNAMS).IEEE,2022:1-8.
[25]HE Z Y,LIU S J,LI M,et al.Learning entity representation for entity disambiguation[C]//Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics(Volume 2:Short Papers).2013:30-34.
[26]CHEN Y,TAN Y S,WU Q B,et al.TGCEL:a chinese entity linking method based on topic relation graph[C]//2017 6th International Conference on Computer Science and Network Technology(ICCSNT).IEEE,2017:226-230.
[27]OUYANG X Y,CHEN S D,ZHAO H,et al.A multi-cross matching network for chinese named entity linking in short text[C]//Journal of Physics:Conference Series.IOP Publishing,2019.
[28]HUA X Y,LI L,HUA L F,et al.XREF:entity linking for chinese news comments with supplementary article reference[C]//Automated Knowledge Base Construction.2020.
[29]MURTY S,VERGA P,VILNIS L,et al.Hierarchical losses and new resources for fine-grained entity typing and linking[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).2018:97-109.
[30]RADFORD A,NARASIMHAN K,SALIMANS T,et al.Improving language understanding with unsupervised learning[J].Citado,2018,17:1-12.
[31]YAMADA I,SHINDO H,TAKEDA H,et al.Joint learning of the embedding of words and entities for named entity disambiguation[C]//20th SIGNLL Conference on Computational Natural Language Learning,CoNLL 2016.Association for Computational Linguistics(ACL),2016:250-259.
[32]ROBERTSON S,ZARAGOZA H.The probabilistic relevance framework:BM25 and beyond[J].Foundations and Trends® in Information Retrieval,2009,3(4):333-389.
[33]WIATRAK M,ARVANITI E,BRAYNE A,et al.Proxy-based zero-shot entity linking by effective candidate retrieval[C]//Proceedings of the 13th International Workshop on Health Text Mining and Information Analysis(LOUHI).2022:87-99.
[34]XU Z R,CHEN Y L,SHI S B,et al.Enhancing entity linking with contextualized entity embeddings [C]//CCF International Conference on Natural Language Processing and Chinese Computing.Cham:Springer Nature Switzerland,2022:228-239.
[35]SEVGILI Ö,SHELMANOV A,ARKHIPOV M,et al.Neural entity linking:a survey of models based on deep learning[J].Semantic Web,2022,13(3):527-570.
[36]RISTOSKI P,LIN Z Z,ZHOU Q Z.KG-ZESHEL:knowledge graph-enhanced zero-shot entity linking[C]//Proceedings of the 11th on Knowledge Capture Conference.2021:49-56.
[37]XU Z R,SHAN Z F,LI Y X,et al.Hansel:a chinese few-shot and zero-shot entity linking benchmark[C]//Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining.2023:832-840.
[38]HUANG S J,WANG B B,QIN L B,et al.Improving few-shot and zero-shot entity linking with coarse-to-fine lexicon-based retriever[C]//CCF International Conference on Natural Language Processing and Chinese Computing.Cham:Springer Nature Switzerland,2023:245-256.
[39]ZHOU H Y,SUN C J,LIN L,et al.ERNIE-AT-CEL:a chinese few-shot emerging entity linking model based on ERNIE and adversarial training[C]//CCF International Conference on Natural Language Processing and Chinese Computing.Cham:Springer Nature Switzerland,2023:48-56.
[40]LI Y X,ZHAO Y,HU B T,et al.Glyphcrm:bidirectional encoder representation for chinese character with its glyph [J].arXiv:2107.00395,2021.
[41]HUMEAU S,SHUSTER K,LACHAUX M A,et al.Poly-encoders:architectures and pre-training strategies for fast and accurate multi-sentence scoring[C]//International Conference on Learning Representations.2019.
[42]ZENG W X,ZHAO X,TANG J Y,et al.Cleek:a chinese long-text corpus for entity linking[C]//Proceedings of the 12th Language Resources and Evaluation Conference.2020:2026-2035.
[43]GONG S,XIONG X,LI S,et al.Chinese entity linking with two-stage pre-training transformer encoders[C]//2022 International Conference on Machine Learning and Knowledge Engineering(MLKE).IEEE,2022:288-293.
[44]CUI Y,CHE W,LIU T,et al.Pre-training with whole word masking for chinese bert[J].IEEE/ACM Transactions on Audio,Speech,and Language Processing,2021,29:3504-3514.