Computer Science ›› 2024, Vol. 51 ›› Issue (7): 80-88. DOI: 10.11896/jsjkx.231000138
• Database & Big Data & Data Science •
YANG Shasha, YU Yaxin, WANG Yueru, XU Jingming, WEI Yangjie, LI Xinhua
[1] RIACHI E, MAMDANI M, FRALICK M, et al. Challenges for reinforcement learning in healthcare[J]. arXiv:2103.05612, 2021.
[2] CORONATO A, NAEEM M, DE PIETRO G, et al. Reinforcement learning for intelligent healthcare applications: a survey[J]. Artificial Intelligence in Medicine, 2020, 109: 101964.
[3] YU C, LIU J, NEMATI S, et al. Reinforcement learning in healthcare: a survey[J]. ACM Computing Surveys (CSUR), 2021, 55(1): 1-36.
[4] KOMOROWSKI M, CELI L A, BADAWI O, et al. The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care[J]. Nature Medicine, 2018, 24(11): 1716-1720.
[5] RAGHU A, KOMOROWSKI M, AHMED I, et al. Deep reinforcement learning for sepsis treatment[J]. arXiv:1711.09602, 2017.
[6] WANG L, ZHANG W, HE X, et al. Supervised reinforcement learning with recurrent neural network for dynamic treatment recommendation[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2018: 2447-2456.
[7] KAUSHIK P, KUMMETHA S, MOODLEY P, et al. A conservative Q-learning approach for handling distribution shift in sepsis treatment strategies[J]. arXiv:2203.13884, 2022.
[8] FUJIMOTO S, GU S S. A minimalist approach to offline reinforcement learning[J]. Advances in Neural Information Processing Systems, 2021, 34: 20132-20145.
[9] YIN C, LIU R, CATERINO J, et al. Deconfounding actor-critic network with policy adaptation for dynamic treatment regimes[C]//Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022: 2316-2326.
[10] FATEMI M, KILLIAN T W, SUBRAMANIAN J, et al. Medical dead-ends and learning to identify high-risk states and treatments[C]//Advances in Neural Information Processing Systems 34. 2021.
[11] TESAURO G. Programming backgammon using self-teaching neural nets[J]. Artificial Intelligence, 2002, 134(1/2): 181-199.
[12] SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search[J]. Nature, 2016, 529(7587): 484-489.
[13] REDDY G, CELANI A, SEJNOWSKI T J, et al. Learning to soar in turbulent environments[J]. Proceedings of the National Academy of Sciences, 2016, 113(33): E4877-E4884.
[14] JETER R, JOSEF C, SHASHIKUMAR S, et al. Does the "Artificial Intelligence Clinician" learn optimal treatment strategies for sepsis in intensive care?[J]. arXiv:1902.03271, 2019.
[15] LIANG D, DENG H, LIU Y. The treatment of sepsis: an episodic memory-assisted deep reinforcement learning approach[J]. Applied Intelligence, 2023, 53(9): 11034-11044.
[16] YU C, REN G, DONG Y. Supervised-actor-critic reinforcement learning for intelligent mechanical ventilation and sedative dosing in intensive care units[J]. BMC Medical Informatics and Decision Making, 2020, 20(3): 1-8.
[17] THOMAS P S. Safe reinforcement learning[R]. University of Massachusetts Libraries, 2015.
[18] THOMAS P S, CASTRO DA SILVA B, BARTO A G, et al. Preventing undesirable behavior of intelligent machines[J]. Science, 2019, 366(6468): 999-1004.
[19] LAROCHE R, TRICHELAIR P, DES COMBES R T. Safe policy improvement with baseline bootstrapping[C]//International Conference on Machine Learning. PMLR, 2019: 3652-3661.
[20] FATEMI M, SHARMA S, VAN SEIJEN H, et al. Dead-ends and secure exploration in reinforcement learning[C]//International Conference on Machine Learning. PMLR, 2019: 1873-1881.
[21] KILLIAN T W, ZHANG H, SUBRAMANIAN J, et al. An empirical study of representation learning for reinforcement learning in healthcare[C]//Machine Learning for Health. PMLR, 2020: 139-160.
[22] FUJIMOTO S, VAN HOOF H, MEGER D. Addressing function approximation error in actor-critic methods[C]//International Conference on Machine Learning. PMLR, 2018: 1587-1596.
[23] JOHNSON A E W, POLLARD T J, SHEN L, et al. MIMIC-III, a freely accessible critical care database[J]. Scientific Data, 2016, 3(1): 1-9.
[24] NANAYAKKARA T, CLERMONT G, LANGMEAD C J, et al. Unifying cardiovascular modelling with deep reinforcement learning for uncertainty aware control of sepsis treatment[J]. PLOS Digital Health, 2022, 1(2): e0000012.
[25] PEINE A, HALLAWA A, BICKENBACH J, et al. Development and validation of a reinforcement learning algorithm to dynamically optimize mechanical ventilation in critical care[J]. NPJ Digital Medicine, 2021, 4(1): 1-12.
[26] WENG W H, GAO M, HE Z, et al. Representation and reinforcement learning for personalized glycemic control in septic patients[J]. arXiv:1712.00654, 2017.
[27] ZHANG Y, CHEN R, TANG J, et al. LEAP: learning to prescribe effective and safe treatment combinations for multimorbidity[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017: 1315-1324.
[28] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[29] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[J]. arXiv:1509.02971, 2015.