Computer Science ›› 2020, Vol. 47 ›› Issue (3): 211-216. doi: 10.11896/jsjkx.190200259

• Artificial Intelligence •

Clinical Electronic Medical Record Named Entity Recognition Incorporating Language Model and Attention Mechanism

TANG Guo-qiang (唐国强), GAO Da-qi (高大启), RUAN Tong (阮彤), YE Qi (叶琪), WANG Qi (王祺)

  1. (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
  • Received: 2019-02-01 Online: 2020-03-15 Published: 2020-03-30
  • Corresponding author: GAO Da-qi (gaodaqi@ecust.edu.cn)
  • About author: TANG Guo-qiang, born in 1993, master. His main research interests include natural language processing. GAO Da-qi, born in 1977, Ph.D., professor. His main research interests include pattern recognition and machine learning.
  • Supported by:
    This work was supported by the National Key R&D Program of China (2018YFC0910500).

Abstract: Clinical named entity recognition (CNER) aims to identify named entities relevant to clinical medicine in a given set of electronic medical record documents and to classify them into predefined categories such as diseases, symptoms, and exams. It is a fundamental task for clinical and translational research and is usually treated as a sequence labeling problem. In recent years, deep neural network methods have been widely applied to named entity recognition with strong results. However, most of these methods do not make effective use of the large amount of available unlabeled data, and the features they use are relatively simple and fail to capture the deeper characteristics of clinical text. To address these two problems, this paper proposes a deep learning model that incorporates a language model and multi-head attention. First, character embeddings and a language model are trained on unlabeled clinical texts. Then, the labeling model is trained on labeled clinical texts. Specifically, the vector representation of a sentence is fed into a bidirectional gated recurrent unit network (BiGRU) and the pre-trained language model, and the two outputs are concatenated. The concatenated vectors are then fed into a second BiGRU and a multi-head attention module. Finally, the outputs of this BiGRU and the multi-head attention module are concatenated and passed to a conditional random field (CRF) layer, which predicts the globally optimal label sequence. By exploiting language model features and multi-head attention, the proposed method achieves an F1-score of 91.34% on the CCKS-2017 Shared Task 2 benchmark dataset.
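
To make the sequence labeling view and the final CRF decoding step more concrete, the following Python sketch may help. It is only an illustration under stated assumptions: the tag set (`O`, `B-SYM`, `I-SYM`), all scores, and the helper names `viterbi` and `bio_to_spans` are invented for this example and are not taken from the paper. In the described model the per-character scores would come from the BiGRU and multi-head attention layers; here they are hard-coded.

```python
# Illustrative sketch only: tag set, scores, and helper names are assumptions,
# not the paper's implementation.
from typing import Dict, List, Tuple

TAGS = ["O", "B-SYM", "I-SYM"]  # hypothetical BIO tags for a "symptom" entity


def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            tags: List[str]) -> List[str]:
    """Return the highest-scoring tag path (emission + transition scores)."""
    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            # Pick the predecessor tag that maximizes the path score into `cur`.
            prev = max(tags, key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0))
            scores[cur] = best[-1][prev] + transitions.get((prev, cur), 0.0) + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIO tags into (start, end_exclusive, entity_type) spans."""
    spans, start, kind = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        inside = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not inside:
            spans.append((start, i, kind))
            start, kind = None, None
        if tag.startswith("B-"):
            start, kind = i, tag[2:]
    return spans


# Made-up per-character scores for a four-character sentence.
EMISSIONS = [
    {"O": 2.0, "B-SYM": 0.1, "I-SYM": 0.0},
    {"O": 1.5, "B-SYM": 0.2, "I-SYM": 0.1},
    {"O": 1.0, "B-SYM": 0.9, "I-SYM": 0.2},
    {"O": 0.3, "B-SYM": 0.1, "I-SYM": 1.5},
]
# Transition scores: O -> I-SYM is effectively forbidden, B-SYM -> I-SYM is
# rewarded; every pair not listed here defaults to 0.0.
TRANSITIONS = {("O", "I-SYM"): -1e4, ("B-SYM", "I-SYM"): 0.5}

greedy = [max(TAGS, key=e.get) for e in EMISSIONS]  # per-position argmax
path = viterbi(EMISSIONS, TRANSITIONS, TAGS)
spans = bio_to_spans(path)
print(greedy)  # per-character decisions can yield an invalid O -> I-SYM step
print(path, spans)
```

In this toy case the per-character argmax produces an `I-SYM` tag directly after `O`, which is not a valid BIO sequence, whereas decoding over the whole sequence with transition scores yields a consistent entity span. This is the general motivation for placing a CRF layer on top of per-character features.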

Key words: Clinical named entity recognition, Deep neural network, GRU, Language model, Multi-head attention

CLC Number: 

  • TP391