Computer Science ›› 2021, Vol. 48 ›› Issue (4): 237-242. doi: 10.11896/jsjkx.200100036

• Artificial Intelligence •

Multi-granularity Medical Entity Recognition for Chinese Electronic Medical Records

ZHOU Xiao-jin1, XU Chen-ming2, RUAN Tong1   

  1 School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
  2 School of Science, East China University of Science and Technology, Shanghai 200237, China
  • Received: 2020-06-24 Online: 2021-04-15 Published: 2021-04-09
  • About author: ZHOU Xiao-jin, born in 1996, postgraduate, is a student member of China Computer Federation. His main research interests include natural language processing and information extraction.
    RUAN Tong, born in 1973, professor, Ph.D. supervisor, is a member of China Computer Federation. Her main research interests include text extraction, knowledge graph and data quality assessment.
  • Supported by:
    Major Special Project of Precision Medical Research (2018YFC0910500) and National Natural Science Foundation of China (61772201).

Abstract: In existing named entity recognition tasks for Chinese clinical electronic medical records, the annotation granularity is usually either too fine or too coarse. Overly fine-grained annotations are hard to apply in practice, while overly coarse-grained annotations usually require complex post-processing to determine the standard form and semantic type of each entity before they can support downstream data mining. To simplify post-processing, 9 types of fine-grained analytical entities are defined, based on the characteristics of 7 common coarse-grained clinical entities, to explain the coarse-grained entities. In addition, based on the characteristics of multi-granularity entities, a multi-granularity clinical entity recognition model built on multi-task learning and a self-attention mechanism is proposed, and 5,000 texts containing multi-granularity entities are annotated from real hospital electronic medical records to validate the model. Experimental results show that the model outperforms mainstream sequence labeling models, reaching F1 scores of 92.88 and 85.48 on coarse-grained and fine-grained entity recognition, respectively.
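
The two-level annotation and the F1 metric reported above can be illustrated with a minimal sketch. It is not the authors' model: it only shows how a coarse-grained tag layer and a fine-grained tag layer over the same characters might be decoded into entity spans and scored with entity-level F1. The label names (FINDING, BODY, LESION), the example phrase, and the helper names are hypothetical, since the abstract does not list the 7 coarse-grained and 9 fine-grained types.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Decode a BIO tag sequence into a set of typed entity spans."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" closes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
            # Only a B- tag opens a new span; stray I- tags are ignored.
            start, label = (i, tag[2:]) if tag.startswith("B-") else (None, None)
    return spans


def entity_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Entity-level F1: a prediction counts only on exact span and type match."""
    tp = len(gold & pred)
    if tp == 0:
        return 0.0
    precision, recall = tp / len(pred), tp / len(gold)
    return 2 * precision * recall / (precision + recall)


def nested_in(fine: Set[Span], coarse: Set[Span]) -> bool:
    """Check that every fine-grained span lies inside some coarse-grained span."""
    return all(any(cs <= fs and fe <= ce for cs, ce, _ in coarse)
               for fs, fe, _ in fine)


# Hypothetical example: one coarse entity explained by two fine-grained ones.
chars = list("左肺上叶结节")
coarse_gold = ["B-FINDING"] + ["I-FINDING"] * 5
fine_gold = ["B-BODY", "I-BODY", "I-BODY", "I-BODY", "B-LESION", "I-LESION"]

# Hypothetical predictions: coarse layer correct, fine layer misses one boundary.
coarse_pred = list(coarse_gold)
fine_pred = ["B-BODY", "I-BODY", "I-BODY", "I-BODY", "O", "B-LESION"]

coarse_f1 = entity_f1(bio_to_spans(coarse_gold), bio_to_spans(coarse_pred))
fine_f1 = entity_f1(bio_to_spans(fine_gold), bio_to_spans(fine_pred))
print("coarse F1:", coarse_f1)
print("fine F1:", fine_f1)
print("fine nested in coarse:",
      nested_in(bio_to_spans(fine_gold), bio_to_spans(coarse_gold)))
```

Scoring the two layers separately mirrors the way the abstract reports one F1 score per granularity. A multi-task model would predict both tag sequences from a shared encoder, which this sketch does not attempt to reproduce.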

Key words: Electronic medical records, Multi-granularity named entity recognition, Multi-task learning

CLC Number: 

  • TP391