Computer Science, 2025, Vol. 52, Issue 4: 262-270. DOI: 10.11896/jsjkx.240100119
潘建1,2, 吴志伟1, 李燕君1
PAN Jian1,2, WU Zhiwei1, LI Yanjun1
Abstract: Current research on entity linking pays relatively little attention to Chinese entity linking and to the linking of emerging or little-known entities. In addition, the conventional BERT model ignores two key aspects of Chinese, namely glyphs and radicals, both of which provide important syntactic and semantic information for language understanding. To address these problems, a zero-shot entity linking model based on Chinese-specific features, CGR-BERT-ZESHEL, is proposed. The model first feeds glyph features and radical features into the network through visual image embeddings and conventional character embeddings, respectively, which enriches the word-vector representations and mitigates the impact of out-of-vocabulary words on model performance; it then obtains the linking result through a two-stage procedure of candidate entity generation followed by candidate entity ranking. Experiments on the Hansel and CLEEK datasets show that, compared with the baseline model, CGR-BERT-ZESHEL improves Recall@100 in the candidate generation stage by 17.49% and 7.34%, and Accuracy in the candidate ranking stage by 3.02% and 3.11%, respectively; it also outperforms the other comparison models on both Recall@100 and Accuracy.
CLC Number:
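The abstract describes two components that can be made concrete: an encoder whose character embeddings are fused with glyph (visual) and radical features, and a two-stage pipeline of candidate entity generation followed by candidate entity ranking, evaluated with Recall@100 and Accuracy. The PyTorch sketch below illustrates one way these pieces could fit together; the class and function names (GlyphRadicalEncoder, retrieve_candidates, select_entity), the projection layers, and the dot-product retrieval are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
from transformers import BertModel


class GlyphRadicalEncoder(nn.Module):
    """BERT encoder whose input embeddings are augmented with character-level
    glyph (visual) and radical features, as sketched from the abstract."""

    def __init__(self, bert_name="bert-base-chinese",
                 glyph_dim=128, radical_vocab=300, radical_dim=64):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # Hypothetical feature sources: glyph vectors might come from a small
        # CNN over character images, radicals from a per-character lookup.
        self.glyph_proj = nn.Linear(glyph_dim, hidden)
        self.radical_emb = nn.Embedding(radical_vocab, radical_dim)
        self.radical_proj = nn.Linear(radical_dim, hidden)
        self.fuse = nn.Linear(3 * hidden, hidden)

    def forward(self, input_ids, attention_mask, glyph_feats, radical_ids):
        # Concatenate the ordinary token embedding with the projected glyph
        # and radical features, then fuse back to BERT's hidden size.
        word_emb = self.bert.embeddings.word_embeddings(input_ids)
        fused = self.fuse(torch.cat(
            [word_emb,
             self.glyph_proj(glyph_feats),
             self.radical_proj(self.radical_emb(radical_ids))], dim=-1))
        out = self.bert(inputs_embeds=fused, attention_mask=attention_mask)
        # [CLS] vector as the mention/entity encoding
        return out.last_hidden_state[:, 0]


def retrieve_candidates(mention_vec, entity_vecs, k=100):
    """Stage 1 (candidate generation): score all entity encodings against the
    mention encoding and keep the top-k; Recall@100 is measured on this set."""
    scores = entity_vecs @ mention_vec  # dot-product similarity, shape (num_entities,)
    return torch.topk(scores, k=min(k, entity_vecs.size(0))).indices


def select_entity(ranking_scores):
    """Stage 2 (candidate ranking): pick the highest-scoring candidate;
    Accuracy is whether this candidate is the gold entity."""
    return torch.argmax(ranking_scores)
```

In practice the ranking stage would typically re-score each retrieved candidate with a cross-encoder over the mention context and the candidate's description; the argmax shown here only abstracts that step.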