Computer Science, 2023, Vol. 50, Issue (5): 189-200. doi: 10.11896/jsjkx.220400252
ZHU Taojie1, LU Jicang1,2, ZHOU Gang1,2, DING Xiaoyao1, WANG Ling1, ZHU Xiubao1
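Abstract: Relation extraction is an important direction in information extraction research, and it has gradually expanded from the sentence level to the document level. Compared with sentences, documents usually contain more relational facts and can therefore provide more information to support knowledge base construction, information retrieval, semantic analysis, and similar tasks. However, document-level relation extraction is more complex and more difficult, and a systematic, comprehensive review of it is currently lacking. To better promote in-depth research on and development of document-level relation extraction, this paper comprehensively analyzes existing techniques and methods and, from the perspective of data preprocessing and core algorithms, broadly divides existing document-level relation extraction research into three categories: tree-based, sequence-based, and graph-based. On this basis, it analyzes and describes some typical methods in each category, the latest progress, and the remaining shortcomings. It also introduces some commonly used datasets and performance evaluation metrics, and lists the reported performance of some typical methods. Finally, it analyzes and summarizes the problems in existing document-level relation extraction research, and points out possible future trends and research directions that merit further attention.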