Survey on Positional Encoding Algorithms in Deep Learning

doi:10.11896/jsjkx.250300107

Abstract

Abstract: In deep learning, positional encoding is a critical component for enhancing neural networks' ability to understand sequence structure. Particularly in Transformers and their variants, positional encoding addresses an inherent limitation of the self-attention mechanism, which cannot intrinsically capture sequential order. This paper systematically reviews the theoretical foundations of positional encoding, the conceptual design of various encoding strategies, and their applications across diverse neural network architectures. First, the paper revisits traditional models such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory networks (LSTMs), discussing how they implicitly model sequence positions and examining the theoretical motivations for introducing explicit positional encoding in Transformers. It then presents in detail absolute positional encoding strategies (including sinusoidal positional encoding and learnable positional embeddings), relative positional encoding methods such as Transformer-XL and RoPE, bias-based positional encoding methods such as ALiBi and KERPLE, and recent optimization techniques tailored for extremely long sequence tasks, notably NTK-aware RoPE, YaRN, and CoPE. Moreover, the paper analyzes in depth the impact of positional encoding on model performance, covering computational efficiency, extrapolation capability, and the modeling of long-range dependencies. Frontier topics including numerical stability and frequency spectrum optimization are also addressed. Finally, the study summarizes current research trends in positional encoding and outlines its future prospects in areas such as large-scale sequence modeling, hybrid network architectures, and hierarchical data structure modeling. The overarching aim is to provide researchers and practitioners with a comprehensive and detailed reference to facilitate the selection of appropriate positional encoding methods for specific tasks and to foster further advances in related fields.

Key words: Positional encoding, Transformer models, Self-attention mechanism, Sequence modeling, Rotary positional encoding

CLC Number: TP183

YANG Geer, WANG Xin, SUN Wei, WANG Xinge, HU Zhongrui, MENG Wenjun, ZHANG Junqiang, WU Xinghui, LIU Jinshan, YAN Yuming. Survey on Positional Encoding Algorithms in Deep Learning [J]. Computer Science, 2026, 53(6A): 250300107-16.


