Computer Science, 2022, Vol. 49, Issue (6A): 32-38. doi: 10.11896/jsjkx.210400198

New Text Retrieval Model of Chinese Electronic Medical Records

YU Jia-qi1,2, KANG Xiao-dong1, BAI Cheng-cheng1, LIU Han-qing1   

  1 School of Medical Image, Tianjin Medical University, Tianjin 300202, China
  2 Clinical Medical College of Tianjin Medical University, Tianjin 300270, China
  • Online: 2022-06-10 Published: 2022-06-08
  • About author: YU Jia-qi, born in 1989, postgraduate. Her main research interests include medical information processing.
    KANG Xiao-dong, born in 1964, Ph.D, professor, postgraduate supervisor, is a member of China Computer Federation. His main research interests include medical image processing and medical information system integration.

Abstract: The growth of electronic medical records (EMRs) forms the basis of user health big data, which can improve the quality of medical services and reduce medical costs. Therefore, the rapid and effective retrieval of cases has practical significance in clinical medicine. Electronic medical records are highly specialized and have distinctive text characteristics. However, traditional text retrieval methods suffer from inaccurate semantic representation of text entities and low retrieval accuracy. To address these characteristics and problems, this paper proposes a fused BERT-BiLSTM model structure to fully express the semantic information of electronic medical record text and improve retrieval accuracy. The research is based on public data. First, the open standard Chinese EMR data are preprocessed by extending the retrieval keywords with correlated terms according to clinical diagnosis rules. Second, the BERT model is used to dynamically obtain a word-granularity vector matrix according to the context of the medical record text, and the generated word vectors are then used as the input of a bidirectional long short-term memory network model (BiLSTM) to extract the global semantic features of the context information. Finally, the feature vector of the retrieved document is mapped to Euclidean space, and the medical record text closest to the retrieved document is found, realizing text retrieval over unstructured clinical data. Simulation results show that this method can mine multi-level and multi-angle semantic features from medical record text. The F1 value obtained on the electronic medical record data set is 0.94, which indicates a significant improvement in the accuracy of semantic text retrieval.
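
The final retrieval step described in the abstract (finding the medical record closest to the retrieved document in Euclidean space) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `euclidean_distance` and `retrieve_nearest`, the record identifiers, and the three-dimensional toy vectors are all hypothetical, and the vectors merely stand in for features that a BERT-BiLSTM encoder would produce.

```python
import math
from typing import Dict, List, Sequence, Tuple


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the Euclidean (L2) distance between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve_nearest(
    query: Sequence[float],
    records: Dict[str, Sequence[float]],
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Rank records by distance to the query vector and keep the top_k closest."""
    ranked = sorted(
        ((record_id, euclidean_distance(query, vector))
         for record_id, vector in records.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:top_k]


# Hypothetical toy feature vectors; in the described pipeline these would be
# the encoder's output for each medical record text (assumption).
records = {
    "record_a": [0.9, 0.1, 0.0],
    "record_b": [0.1, 0.8, 0.3],
    "record_c": [0.2, 0.7, 0.4],
}
query = [0.1, 0.8, 0.2]  # hypothetical feature vector of the query document

results = retrieve_nearest(query, records, top_k=2)
for record_id, distance in results:
    print(f"{record_id}: distance = {distance:.3f}")
```

A linear scan like this is only practical for small collections; the abstract does not say how the nearest record is searched for at scale, so any indexing strategy would be a further assumption.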

Key words: BERT model, Bidirectional long short-term memory network model, Electronic medical record, Extended search keywords, Text retrieval

CLC Number: 

  • TP391
