Computer Science ›› 2021, Vol. 48 ›› Issue (11A): 630-637. doi: 10.11896/jsjkx.210300070

• Interdiscipline & Application •

Diagnostic Prediction Based on Node2vec and Knowledge Attention Mechanisms

LI Hang, LI Wei-hua, CHEN Wei, YANG Xian-ming, ZENG Cheng   

  1. School of Information Science and Engineering, Yunnan University, Kunming 650504, China
  • Online: 2021-11-10 Published: 2021-11-12
  • About author: LI Hang, born in 1996, postgraduate. His main research interests include bioinformatics and machine learning.
    LI Wei-hua, born in 1977, Ph.D., associate professor. Her main research interests include bioinformatics, data mining and knowledge engineering.
  • Supported by:
    National Natural Science Foundation of China (32060151) and Scientific Research Foundation of the Education Department of Yunnan Province, China (2019J0006).

Abstract: Diagnostic prediction forecasts a patient's future diagnoses from their historical health states and is a core task in personalized medical decision-making. Electronic health records (EHRs) document patients' time-varying health conditions and clinical care, providing a wealth of longitudinal clinical data for diagnostic prediction. However, existing EHR-based diagnostic prediction models cannot fully learn hidden disease progression patterns. Moreover, the performance of fine-grained diagnostic prediction depends heavily on more informative sequence features. To improve performance, we propose a diagnostic prediction model called the Node2vec and knowledge attention model (NKAM). Specifically, the model uses Node2vec to capture latent medical knowledge from the global structure of a medical ontology. It also maps categories into low-dimensional vectors and encodes the medical knowledge of patients' health states into category embedding vectors. The diagnosis code embedding vectors are used to enrich the fine-grained representation of patients' health states. Long-term dependencies and disease progression patterns are then extracted from the patient's historical health states using a knowledge attention mechanism combined with a gated recurrent unit (GRU). Experimental results on a real-world dataset show that NKAM significantly improves prediction performance compared with state-of-the-art methods. Furthermore, the experiments reveal that Node2vec can capture more informative medical concept embeddings from the medical ontology, and that the knowledge-based attention mechanism helps integrate external knowledge with electronic health records effectively.
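
As a rough illustration of the Node2vec step mentioned in the abstract, the sketch below generates second-order biased random walks over a tiny ontology graph, following the general node2vec scheme with return parameter p and in-out parameter q. The ontology, its node names, and the parameter values are hypothetical and chosen only for illustration; the abstract does not specify how NKAM configures or consumes the walks. In a typical node2vec pipeline, such walks would then be passed to a skip-gram model to learn the embedding vectors.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one second-order biased random walk of up to `length` nodes."""
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])
        if not neighbors:
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)  # step back to the previous node
            elif nxt in graph[prev]:
                weights.append(1.0)      # stay at distance 1 from the previous node
            else:
                weights.append(1.0 / q)  # move further away from the previous node
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk


# Hypothetical toy ontology (parent-child edges); not taken from the paper.
edges = [
    ("root", "circulatory"),
    ("root", "endocrine"),
    ("circulatory", "hypertension"),
    ("circulatory", "heart_failure"),
    ("endocrine", "diabetes"),
]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

rng = random.Random(0)
walks = [
    node2vec_walk(graph, node, length=6, p=1.0, q=0.5, rng=rng)
    for node in sorted(graph)
    for _ in range(3)
]
```

Setting q below 1 makes walks more likely to move outward through the hierarchy, while a small p makes them more likely to backtrack; which regime suits a medical ontology is an empirical question that this sketch does not settle.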

Key words: Diagnostic prediction, Electronic health record (EHR), Gated recurrent unit (GRU), Knowledge attention, Node2vec

CLC Number: 

  • TP391