Computer Science, 2021, Vol. 48, Issue (10): 77-84. doi: 10.11896/jsjkx.210300271

• Artificial Intelligence

Medical Entity Relation Extraction Based on Deep Neural Network and Self-attention Mechanism

ZHANG Shi-hao, DU Sheng-dong, JIA Zhen, LI Tian-rui

  1. School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, China
  • Received: 2021-03-29 Revised: 2021-05-21 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: LI Tian-rui (trli@swjtu.edu.cn)
  • About author: ZHANG Shi-hao (shihao_zura@163.com), born in 1996, postgraduate. His main research interests include information extraction and natural language processing.
    LI Tian-rui, born in 1969, Ph.D, professor, Ph.D supervisor, is a distinguished member of China Computer Federation. His main research interests include big data intelligence, rough sets and granular computing.
  • Supported by:
    Sichuan Key R&D Project (2020YFG0035).

Abstract: As medical informatization advances, the medical field has accumulated a large amount of unstructured text data, and mining valuable information from these texts has become a research hotspot in both the medical industry and natural language processing. With the development of deep learning, deep neural networks have been gradually applied to relation extraction, and the “recurrent+CNN” network framework has become the mainstream model for medical entity relation extraction. However, medical texts have a high entity density and interleaved relations between entities, so the “recurrent+CNN” framework cannot deeply mine the semantic features of medical sentences. Building on the “recurrent+CNN” framework, this paper proposes a Chinese medical entity relation extraction model that incorporates a multi-channel self-attention mechanism. The model 1) uses BLSTM to capture the contextual information of sentences, 2) uses multi-channel self-attention to mine the global semantic features of sentences, and 3) uses CNN to capture the local phrase features of sentences. Experiments on a Chinese medical text dataset verify the effectiveness of the model: its precision, recall and F1 value all improve on those of mainstream models.

Key words: Deep learning, Entity relation extraction, Medical text, Multi-channel self-attention
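
The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the BLSTM is replaced by a random placeholder matrix `H`, the attention weights follow the structured self-attention form A = softmax(W2 tanh(W1 Hᵀ)) of Lin et al. (2017), and the way each attention channel is passed to the CNN is an assumption made here for illustration. All function names, sizes and weights are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def attention_channels(H, W1, W2):
    """Return (A, channels).

    H is n x d: one hidden state per token (BLSTM outputs in the paper).
    A = softmax(W2 tanh(W1 H^T)) is r x n: one weight distribution per channel.
    channels[k][i] is token i's hidden state scaled by A[k][i] (an assumption).
    """
    Ht = [list(col) for col in zip(*H)]                                # d x n
    hidden = [[math.tanh(v) for v in row] for row in matmul(W1, Ht)]   # da x n
    A = [softmax(row) for row in matmul(W2, hidden)]                   # r x n
    channels = [[[a * v for v in h] for a, h in zip(weights, H)] for weights in A]
    return A, channels


def conv_max_pool(X, filters, width):
    """Slide each filter over windows of `width` consecutive rows, keep the max."""
    windows = [sum(X[i:i + width], []) for i in range(len(X) - width + 1)]
    return [max(sum(w * x for w, x in zip(f, win)) for win in windows) for f in filters]


def encode_sentence(H, W1, W2, filters, width):
    """Concatenate pooled CNN features from every attention channel."""
    A, channels = attention_channels(H, W1, W2)
    features = [f for ch in channels for f in conv_max_pool(ch, filters, width)]
    return A, features


rng = random.Random(0)
n, d, da, r, width, n_filters = 6, 8, 4, 3, 3, 5   # hypothetical sizes
H = rand_matrix(n, d, rng)                          # placeholder for BLSTM outputs
W1, W2 = rand_matrix(da, d, rng), rand_matrix(r, da, rng)
filters = rand_matrix(n_filters, width * d, rng)
A, features = encode_sentence(H, W1, W2, filters, width)
print(len(A), len(A[0]), len(features))             # prints: 3 6 15
```

Each row of `A` is a probability distribution over the six tokens, so the three channels can attend to different parts of the same sentence. In a full model the concatenated `features` vector would feed a classifier over relation types.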

CLC Number:

  • TP391