Computer Science ›› 2025, Vol. 52 ›› Issue (4): 262-270. doi: 10.11896/jsjkx.240100119

• Artificial Intelligence •

CGR-BERT-ZESHEL:Zero-shot Entity Linking Model with Chinese Features

PAN Jian1,2, WU Zhiwei1, LI Yanjun1   

  1. 1 College of Computer Science and Technology,Zhejiang University of Technology,Hangzhou 310023,China
    2 Zhijiang College of Zhejiang University of Technology,Shaoxing,Zhejiang 312030,China
  • Received:2024-01-15 Revised:2024-05-15 Online:2025-04-15 Published:2025-04-14
  • Corresponding author:PAN Jian(pj@zjut.edu.cn)
  • About author:PAN Jian,born in 1976,Ph.D,associate professor,postgraduate supervisor,is a member of CCF(No.26947M).His main research interests include natural language processing,intelligent information processing and Internet of Things.
  • Supported by:
    Natural Science Foundation of Zhejiang Province,China(LGF20F020015).

Abstract: Current research on entity linking pays relatively little attention to Chinese entity linking and to the linking of emerging or little-known entities. In addition, traditional BERT models ignore two crucial aspects of Chinese, namely glyphs and radicals, which provide important syntactic and semantic information for language understanding. To address these problems, this paper proposes CGR-BERT-ZESHEL, a zero-shot entity linking model based on Chinese features. The model first incorporates glyph features and radical features through visual image embeddings and conventional character embeddings, respectively, which enhances the word vector features and mitigates the impact of out-of-vocabulary words on model performance. It then applies a two-stage method of candidate entity generation and candidate entity ranking to obtain the entity linking results. Experimental results on two datasets, Hansel and CLEEK, show that, compared with the baseline model, CGR-BERT-ZESHEL improves the Recall@100 metric of the candidate entity generation stage by 17.49% and 7.34%, and improves the Accuracy metric of the candidate entity ranking stage by 3.02% and 3.11%. It also outperforms the other compared models on both Recall@100 and Accuracy.
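
To make the two ideas in the abstract concrete, the following is a minimal, illustrative Python (PyTorch) sketch, not the authors' released code: it fuses glyph-image and radical embeddings with conventional character embeddings before a BERT-style encoder, and wires a two-stage pipeline of candidate entity generation (top-k retrieval by vector similarity) and candidate entity ranking. All module names, dimensions (e.g. 24x24 glyph rasters, 768-dimensional hidden size) and the concatenation-based fusion are assumptions made for illustration, not details taken from the paper.

import torch
import torch.nn as nn

class ChineseFeatureFusion(nn.Module):
    """Fuse character, glyph-image and radical features into one token embedding (illustrative)."""
    def __init__(self, vocab_size=21128, n_radicals=300, hidden=768, glyph_dim=24 * 24):
        super().__init__()
        self.char_emb = nn.Embedding(vocab_size, hidden)      # conventional character embedding
        self.radical_emb = nn.Embedding(n_radicals, hidden)   # radical embedding
        self.glyph_proj = nn.Linear(glyph_dim, hidden)        # projection of flattened glyph images
        self.fuse = nn.Linear(3 * hidden, hidden)             # fuse the three feature views

    def forward(self, char_ids, radical_ids, glyph_images):
        # glyph_images: (batch, seq_len, 24, 24) rasterised character glyphs (assumed size)
        glyph = self.glyph_proj(glyph_images.flatten(start_dim=2))
        fused = torch.cat([self.char_emb(char_ids), self.radical_emb(radical_ids), glyph], dim=-1)
        return self.fuse(fused)                               # (batch, seq_len, hidden), fed to a BERT-style encoder

def generate_candidates(mention_vec, entity_vecs, k=100):
    """Stage 1: retrieve the top-k candidate entities by dot-product similarity."""
    scores = entity_vecs @ mention_vec                        # (num_entities,)
    return torch.topk(scores, k=min(k, entity_vecs.size(0))).indices

def rank_candidates(pair_scores):
    """Stage 2: choose the best candidate from mention-entity pair scores (e.g. from a cross-encoder)."""
    return int(torch.argmax(pair_scores))

if __name__ == "__main__":
    fusion = ChineseFeatureFusion()
    char_ids = torch.randint(0, 21128, (1, 8))                # toy mention context of 8 characters
    radical_ids = torch.randint(0, 300, (1, 8))
    glyphs = torch.rand(1, 8, 24, 24)
    tokens = fusion(char_ids, radical_ids, glyphs)            # enhanced token embeddings
    mention_vec = tokens.mean(dim=1).squeeze(0)               # toy mention representation
    entity_vecs = torch.rand(1000, 768)                       # toy entity-description representations
    candidates = generate_candidates(mention_vec, entity_vecs, k=100)
    best = rank_candidates(torch.rand(len(candidates)))       # toy ranking scores for the candidates
    print(tokens.shape, candidates.shape, best)
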

Key words: Entity linking, Chinese zero-shot, BERT, Candidate entity generation, Candidate entity ranking

CLC Number: TP391