Computer Science (计算机科学), 2023, Vol. 50, Issue (7): 46-52. doi: 10.11896/jsjkx.230200216

• Database & Big Data & Data Science •

Disease Diagnosis Prediction Algorithm Based on Contrastive Learning

WANG Mingxia, XIONG Yun

  1. School of Computer Science, Fudan University, Shanghai 200433, China
    Shanghai Key Laboratory of Data Science, Shanghai 200433, China
  • Received: 2023-02-28 Revised: 2023-04-17 Online: 2023-07-15 Published: 2023-07-05
  • Corresponding author: XIONG Yun (yunx@fudan.edu.cn)
  • About author: WANG Mingxia (wangmx20@fudan.edu.cn), born in 1999, postgraduate. Her main research interests include big data and medical data mining. XIONG Yun, born in 1980, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. Her main research interests include data science and data mining.

Abstract: Disease diagnosis prediction uses electronic health data to model disease progression patterns and predict patients' future health status, and it is widely used in clinical decision support, healthcare services, and related fields. To further exploit the valuable information in visit records, a disease diagnosis prediction algorithm based on contrastive learning is proposed. Contrastive learning provides self-supervised training signals by measuring the similarity between samples, which improves the model's ability to capture information. The proposed algorithm mines the knowledge shared among similar patients through contrastive training, strengthening the model's ability to learn patient representations. To capture more comprehensive shared information, it further mines information from the group of patients similar to the target patient and uses it as auxiliary information to characterize the target patient's health status. Experimental results on a public dataset show that, compared with the Retain, Dipole, LSAN, and GRASP algorithms, the proposed algorithm improves AUROC and AUPRC on the readmission prediction task by more than 2.9% and 8.1%, respectively, and improves Recall@10 and MAP@10 on the diagnosis prediction task by more than 2.1% and 1.8%, respectively.

Key words: Diagnosis prediction, Deep learning, Contrastive learning, Clustering, Similar patients
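
The abstract describes contrastive learning as a self-supervised signal derived from the similarity between samples. The sketch below illustrates that idea with an InfoNCE-style loss for a single anchor patient. It is only an illustration: the abstract does not state the paper's actual loss function, and the function names, the temperature value, and the toy vectors are all assumptions introduced here.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor: negative log-softmax of the positive.

    The temperature of 0.1 is an assumed value, not taken from the paper.
    """
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]

# Toy patient representations (hypothetical values for illustration only).
anchor = [1.0, 0.2, 0.0]
similar_patient = [0.9, 0.3, 0.1]
dissimilar_patients = [[-1.0, 0.1, 0.5], [0.0, -0.8, 0.9]]

# Pairing the anchor with a genuinely similar patient gives a small loss.
loss_good = info_nce_loss(anchor, similar_patient, dissimilar_patients)
# Treating a dissimilar patient as the positive gives a much larger loss.
loss_bad = info_nce_loss(
    anchor, dissimilar_patients[0], [similar_patient, dissimilar_patients[1]]
)
```

Minimizing a loss of this shape pulls the representations of similar patients together and pushes dissimilar ones apart, which is the general mechanism the abstract attributes to contrastive training.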

CLC Number:

  • TP311
[1]LI Y J,ZHENG R L,YANG X M.Diagnosis and prediction model of coronary heart disease based on data mining technology[J].Medical Information,2020,33(24):14-17.
[2]ZHU X T,PANG C Y,ZHU H.Cardiovascular disease prediction model based on deep learning[J].Journal of Computer Applications,2021,41(S2):346-350.
[3]LI M,MA L Y,YAO Z.Study on an intelligent diagnosis prediction model based on deep neural network[J].Medical Information,2022,43(8):52-55,75.
[4]CHOI E,BAHADORI M T,KULAS J A,et al.Retain:An interpretable predictive model for healthcare using reverse time attention mechanism[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems.2016:3512-3520.
[5]MA F,CHITTA R,ZHOU J,et al.Dipole:Diagnosis prediction in healthcare via attention-based bidirectional recurrent neural networks[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2017:1903-1911.
[6]XIAO C,MA T,DIENG A B,et al.Readmission prediction via deep contextual embedding of clinical concepts[J].PLOS ONE,2018,13(4):1-15.
[7]CHOI E,BAHADORI M T,SCHUETZ A,et al.Doctor AI:Predicting clinical events via recurrent neural networks[C]//Proceedings of the 1st Machine Learning for Healthcare Conference.2016:301-318.
[8]BAYTAS I M,XIAO C,ZHANG X,et al.Patient subtyping via time-aware LSTM networks[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2017:65-74.
[9]KWON B C,CHOI M J,KIM J T,et al.RetainVis:Visual analytics with interpretable and interactive recurrent neural networks on electronic medical records[J].IEEE Transactions on Visualization and Computer Graphics,2019,25(1):299-309.
[10]BAI T,ZHANG S,EGLESTON B L,et al.Interpretable representation learning for healthcare via capturing disease progression through time[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2018:43-51.
[11]LUO J,YE M,XIAO C,et al.HiTANet:Hierarchical time-aware attention networks for risk prediction on electronic health records[C]//Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.2020:647-656.
[12]MEN L,ILK N,TANG X,et al.Multi-disease prediction using LSTM recurrent neural networks[J].Expert Systems with Applications,2021,177:114905.
[13]SUO Q,MA F,YUAN Y,et al.Personalized disease prediction using a CNN based similarity learning method[C]//2017 IEEE International Conference on Bioinformatics and Biomedicine(BIBM).2017:811-816.
[14]SUO Q,MA F,YUAN Y,et al.Deep patient similarity learning for personalized health care[J].IEEE Transactions on NanoBioscience,2018,17(3):219-227.
[15]ZHANG C,GAO X,MA L,et al.GRASP:Generic framework for health status representation learning based on incorporating knowledge from similar patients[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2021:715-723.
[16]OEI R W,HSU W,LEE M L,et al.Using similar patients to predict complication in patients with diabetes,hypertension,and lipid disorder:a domain knowledge infused convolutional neural network approach[J].Journal of the American Medical Informatics Association,2022,30(2):273-281.
[17]LI Y,YANG D,GONG X.Patient similarity via medical attributed heterogeneous graph convolutional network[J].IAENG International Journal of Computer Science,2022,49(4):1152-1161.
[18]AN Y,LI R,CHEN X.MERGE:A multi-graph attentive representation learning framework integrating group information from similar patients[J].Computers in Biology and Medicine,2022,151:106245.
[19]ZHANG C,CHU X,MA L,et al.M3Care:Learning with missing modalities in multimodal healthcare data[C]//Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.KDD,2022:2418-2428.
[20]VAN DEN OORD A,LI Y,VINYALS O.Representation learning with contrastive predictive coding[J].arXiv:1807.03748,2018.
[21]LI J,ZHOU P,XIONG C,et al.Prototypical contrastive learning of unsupervised representations[J].arXiv:2005.04966,2020.
[22]PENG X,LONG G,SHEN T,et al.Self-attention enhanced patient journey understanding in healthcare system[C]//Joint European Conference on Machine Learning and Knowledge Discovery in Databases.2020:719-735.
[23]YE M,LUO J,XIAO C,et al.LSAN:Modeling long-term dependencies and short-term correlations with hierarchical attention for risk prediction[C]//Proceedings of the 29th ACM International Conference on Information and Knowledge Management.2020:1753-1762.