Computer Science, 2024, Vol. 51, Issue (10): 351-361. doi: 10.11896/jsjkx.230800111

• Artificial Intelligence •

Chemical-induced Disease Relation Extraction: Graph Reasoning Method Based on Evidence Focusing

ZHOU Xueyang1,2, FU Qiming1,2, CHEN Jianping2,3, LU You1,2, WANG Yunzhe1,2

  1 School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  2 Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  3 School of Architecture and Urban Planning, Suzhou University of Science and Technology, Suzhou, Jiangsu 215009, China
  • Received: 2023-08-17  Revised: 2024-01-15  Online: 2024-10-15  Published: 2024-10-11
  • Corresponding author: FU Qiming (fqm_1@mail.usts.edu.cn)
  • About author: (1213574782@qq.com)
    ZHOU Xueyang, born in 1998, postgraduate. His main research interests include natural language processing and biomedical information mining.
    FU Qiming, born in 1985, Ph.D, professor, is a member of CCF (No.23956M). His main research interests include reinforcement learning, deep learning and intelligent information processing.
  • Supported by:
    National Key R&D Program of China (2020YFC2006602), National Natural Science Foundation of China (62102278, 62072324), University Natural Science Foundation of Jiangsu Province (21KJA520005), Primary Research and Development Plan of Jiangsu Province (BE2020026), Postgraduate Education Reform Project of Jiangsu Province, and Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX23_3321).

Abstract: Existing methods for mining the interactions between chemicals and diseases focus too heavily on global information and neglect the small number of evidence clues and the local interactions between mentions. To address this problem, a mention-level document-level relation extraction method based on evidence focusing, the Evidence Focused Mention U-shaped Network (EF-MUnet), is proposed. The method first models mention features with a context-aware strategy and captures local interactions between adjacent mentions using two-dimensional convolution. Second, to avoid interference from irrelevant context, two evidence focusing strategies, ATT-EF and RL-EF, are proposed. The former uses similarity as the measure of evidence clues, while the latter uses reinforcement learning with delayed feedback to learn the optimal evidence extraction policy without supervision. Finally, a U-net network captures global features at the entity level to fully mine semantic relations. Experimental results show that, compared with existing methods, EF-MUnet improves the F1 score by 9.7% on the biomedical dataset CDR and is particularly advantageous for extracting inter-sentence relations. In addition, on DMI, a dataset for extracting drug-mutation interactions, EF-MUnet achieves an accuracy of up to 98.6%, demonstrating that it is an effective biomedical relation extraction method with good generalization ability.

Key words: Relation extraction, Evidence focusing, Reinforcement learning, Self-attention mechanism, Biomedicine
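
The abstract states that ATT-EF uses similarity as the measure of evidence clues, but gives no further detail. The following is a minimal sketch of that general idea only, not the authors' implementation. It assumes, hypothetically, that each sentence and each chemical-disease pair is already encoded as a plain vector, that cosine similarity is the similarity measure, and that the top-scoring sentences are kept as evidence. The names `cosine`, `focus_evidence` and `top_k` are illustrative and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def focus_evidence(pair_vec, sentence_vecs, top_k=2):
    """Score sentences against an entity-pair vector and keep the top_k.

    Hypothetical helper: returns (kept, weights), where `kept` lists the
    indices of the retained sentences in document order and `weights` is a
    softmax over the similarity scores of all sentences.
    """
    if not sentence_vecs:
        return [], []
    scores = [cosine(pair_vec, s) for s in sentence_vecs]
    # Numerically stable softmax over the similarity scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Rank by similarity, keep the best top_k, restore document order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:top_k])
    return kept, weights


# Toy usage with made-up 2-dimensional vectors (assumed data, not from CDR).
pair = [1.0, 0.0]
sentences = [[0.9, 0.1], [0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
kept, weights = focus_evidence(pair, sentences, top_k=2)
```

In this toy input the first and third sentences point in nearly the same direction as the pair vector, so they are the ones retained. A hard top-k cut is only one possible design; the softmax weights could instead be used to reweight all sentences softly.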

CLC Number: 

  • TP391.1