Review of Document-level Relation Extraction Techniques

doi:10.11896/jsjkx.220400252

Abstract

Relation extraction (RE) is an essential direction of information extraction research, and it is gradually expanding from the sentence level to the document level. Compared with sentences, documents usually contain more relational facts and thus provide more information for knowledge base construction, information retrieval, and semantic analysis. However, document-level relation extraction is more complex and challenging, and the field currently lacks a systematic and comprehensive survey. To promote the development of document-level relation extraction, this paper presents a comprehensive and in-depth analysis of existing technologies and methods. From the perspective of data preprocessing methods and core algorithms, it classifies existing methods into three types: tree-based, sequence-based, and graph-based. For each category, it analyzes and describes typical methods, the latest progress, and shortcomings. It also introduces several corpora and evaluation metrics, along with some typical methods. Finally, it analyzes and summarizes the existing problems in document-level relation extraction research and discusses possible future development trends and research directions.

Key words: Information extraction, Document-level relation extraction, Data preprocessing, Datasets, Performance evaluation

CLC Number: TP391

ZHU Taojie, LU Jicang, ZHOU Gang, DING Xiaoyao, WANG Ling, ZHU Xiubao. Review of Document-level Relation Extraction Techniques [J]. Computer Science, 2023, 50(5): 189-200.


