计算机科学 (Computer Science) ›› 2025, Vol. 52 ›› Issue (8): 277-287. doi: 10.11896/jsjkx.240600050
刘乐, 肖蓉, 杨肖
LIU Le, XIAO Rong, YANG Xiao
Abstract: Document-level relation extraction is an important research direction in natural language processing that aims to extract semantic relations between entities from unstructured or semi-structured natural language documents. This work proposes combining a decoupled knowledge distillation method with a cross multi-head attention mechanism to address the document-level relation extraction task. First, the cross multi-head attention mechanism not only attends to elements in different attention heads in parallel, allowing the model to exchange and integrate information at different granularities and levels, but also lets the model account for the correlation between entities and relations while computing attention between head and tail entities, thereby improving its understanding of complex relations and strengthening the learned entity feature representations. In addition, to further improve performance, a decoupled knowledge distillation method is introduced to adapt the model to distantly supervised data. This method decouples the original KL-divergence loss into a target-class knowledge distillation loss (TCKDL) and a non-target-class knowledge distillation loss (NCKDL), two independent terms whose relative importance can be adjusted through hyperparameters. This increases the flexibility and effectiveness of the distillation process and enables more precise knowledge transfer, especially when handling noise in the distantly supervised portion of DocRED. Experimental results show that the proposed model extracts relations between entity pairs on the DocRED dataset more effectively.
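To make the two components concrete, the following are minimal PyTorch sketches, not the authors' implementation. The first illustrates the cross multi-head attention idea described above: an entity-pair representation attends over learned relation-type embeddings with several heads, so head/tail entity features are mixed with relation-correlation information. All names and sizes (hidden_size, num_relations, CrossRelationAttention, etc.) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CrossRelationAttention(nn.Module):
    """Sketch: entity-pair query attends over relation-type embeddings (assumed design)."""
    def __init__(self, hidden_size=768, num_heads=8, num_relations=97):
        super().__init__()
        # learned embeddings standing in for the candidate relation types
        self.relation_emb = nn.Embedding(num_relations, hidden_size)
        self.cross_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        self.pair_proj = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, head_repr, tail_repr):
        # head_repr, tail_repr: [batch, hidden_size] entity representations
        pair = self.pair_proj(torch.cat([head_repr, tail_repr], dim=-1))   # [batch, hidden]
        query = pair.unsqueeze(1)                                          # [batch, 1, hidden]
        rel = self.relation_emb.weight.unsqueeze(0).expand(pair.size(0), -1, -1)
        # each attention head attends from the entity pair to every relation type
        attended, attn_weights = self.cross_attn(query, rel, rel)
        return attended.squeeze(1), attn_weights
```

The second sketch shows the decoupled knowledge distillation loss in the sense of Zhao et al. (DKD): the KL-divergence distillation loss is split into a target-class term (TCKD) and a non-target-class term (NCKD), each weighted by its own hyperparameter. It is written for a single-label classification view of the relation scores; the paper's multi-label, adaptive-threshold setting would require adapting the target handling.

```python
import torch
import torch.nn.functional as F

def dkd_loss(student_logits, teacher_logits, target, alpha=1.0, beta=1.0, T=1.0):
    """Decoupled KD loss: alpha * TCKD + beta * NCKD (illustrative sketch).

    student_logits, teacher_logits: [batch, num_classes]
    target: [batch] indices of the gold (target) class
    alpha, beta: weights of the target-class and non-target-class terms
    T: distillation temperature
    """
    gt_mask = F.one_hot(target, student_logits.size(-1)).bool()   # target class
    other_mask = ~gt_mask                                         # non-target classes

    s_prob = F.softmax(student_logits / T, dim=-1)
    t_prob = F.softmax(teacher_logits / T, dim=-1)

    # TCKD: KL over the binary split (target vs. all non-target probability mass)
    s_bin = torch.stack([(s_prob * gt_mask).sum(-1), (s_prob * other_mask).sum(-1)], dim=-1)
    t_bin = torch.stack([(t_prob * gt_mask).sum(-1), (t_prob * other_mask).sum(-1)], dim=-1)
    tckd = F.kl_div(torch.log(s_bin + 1e-8), t_bin, reduction="batchmean") * (T ** 2)

    # NCKD: KL over the distribution restricted to non-target classes
    s_non = F.log_softmax(student_logits / T - 1000.0 * gt_mask, dim=-1)
    t_non = F.softmax(teacher_logits / T - 1000.0 * gt_mask, dim=-1)
    nckd = F.kl_div(s_non, t_non, reduction="batchmean") * (T ** 2)

    return alpha * tckd + beta * nckd
```

Because alpha and beta are independent, the weight on the noisier non-target knowledge can be tuned separately when distilling from a teacher trained on distantly supervised DocRED data, which is the flexibility the abstract refers to.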