Computer Science ›› 2023, Vol. 50 ›› Issue (10): 223-229. doi: 10.11896/jsjkx.220900108
文坤建, 陈艳平, 黄瑞章, 秦永彬
WEN Kunjian, CHEN Yanping, HUANG Ruizhang, QIN Yongbin
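Abstract: Extracting relations between entities from unstructured biomedical text is of great significance to the development of biomedical informatics, and it is also a research hotspot in natural language processing. At present, correctly extracting relations between entities from biomedical data faces two difficulties: 1) because entity words in biomedical data mostly consist of compound words and unknown words, models struggle to learn the semantic features inside entities; 2) because annotated biomedical data is scarce while neural networks have a large number of parameters, neural networks are prone to overfitting. This paper therefore proposes a biomedical relation extraction method based on prompt learning. It adds annotation tags for entities, which prompt the entities in order to enhance entity semantics and connect them with contextual information. In addition, building on traditional prompt tuning, the paper uses continuous templates to mitigate the performance bias introduced by manually designed templates, and combines them with the deep prompting capability of deep prefixes that control attention, so that the model still achieves good results when only a small amount of data is available.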
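The input construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tag strings `<e1>`/`<e2>`, the `[P0]`-style slot names, the function names, and the example sentence are all hypothetical, since the abstract does not specify them. The sketch only shows where entity annotation tags and continuous-template slots would sit in the input; in an actual model the slots would be trainable vectors, and the deep prefix applied to the attention layers is not covered here.

```python
from typing import List, Tuple

# Hypothetical annotation tags; the abstract does not give the actual tag strings.
ENTITY_TAGS = {"e1": ("<e1>", "</e1>"), "e2": ("<e2>", "</e2>")}


def annotate(tokens: List[str], spans: List[Tuple[str, int, int]]) -> List[str]:
    """Wrap each entity span [start, end) in its annotation tags."""
    opens = {start: ENTITY_TAGS[name][0] for name, start, _ in spans}
    closes = {end: ENTITY_TAGS[name][1] for name, _, end in spans}
    out: List[str] = []
    for i, tok in enumerate(tokens):
        if i in closes:   # close a span that ended just before this token
            out.append(closes[i])
        if i in opens:    # open a span that starts at this token
            out.append(opens[i])
        out.append(tok)
    if len(tokens) in closes:  # span that runs to the end of the sentence
        out.append(closes[len(tokens)])
    return out


def build_prompt(tokens: List[str], spans: List[Tuple[str, int, int]],
                 n_prompt: int = 3, mask: str = "[MASK]") -> str:
    """Append placeholder continuous-prompt slots and a mask position."""
    # [P0]..[Pk] stand in for trainable prompt vectors that would replace
    # hand-written template words in a continuous template.
    slots = [f"[P{i}]" for i in range(n_prompt)]
    return " ".join(annotate(tokens, spans) + slots + [mask])


if __name__ == "__main__":
    sentence = ["IL-2", "binds", "IL-2R", "alpha"]
    entities = [("e1", 0, 1), ("e2", 2, 4)]
    print(build_prompt(sentence, entities))
```

In this sketch the relation label would be predicted at the mask position, and the annotation tags give the model an explicit cue about entity boundaries even when the entity itself is a compound or unknown word.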