Computer Science, 2025, Vol. 52, Issue (1): 323-330. doi: 10.11896/jsjkx.240800072
• Artificial Intelligence •
WANG Yanning1,2, ZHANG Fengdi1,2, XIAO Dengmin3, SUN Zhongqi4
[1] BAO Zepeng, QIAN Tieyun. Survey on Large Model Red Teaming [J]. Computer Science, 2025, 52(1): 34-41.
[2] LI Tingting, WANG Qi, WANG Jiakang, XU Yongjun. SWARM-LLM: An Unmanned Swarm Task Planning System Based on Large Language Models [J]. Computer Science, 2025, 52(1): 72-79.
[3] YAN Yusong, ZHOU Yuan, WANG Cong, KONG Shengqi, WANG Quan, LI Minne, WANG Zhiyuan. COA Generation Based on Pre-trained Large Language Models [J]. Computer Science, 2025, 52(1): 80-86.
[4] WANG Qidi, SHEN Liwei, WU Tianyi. Option Discovery Method Based on Symbolic Knowledge [J]. Computer Science, 2025, 52(1): 277-288.
[5] YAN Xin, HUANG Zhiqiu, SHI Fan, XU Heng. Study on Following Car Model with Different Driving Styles Based on Proximal Policy Optimization Algorithm [J]. Computer Science, 2024, 51(9): 223-232.
[6] WANG Tianjiu, LIU Quan, WU Lan. Offline Reinforcement Learning Algorithm for Conservative Q-learning Based on Uncertainty Weight [J]. Computer Science, 2024, 51(9): 265-272.
[7] ZHOU Wenhui, PENG Qinghua, XIE Lei. Study on Adaptive Cloud-Edge Collaborative Scheduling Methods for Multi-object State Perception [J]. Computer Science, 2024, 51(9): 319-330.
[8] LI Jingwen, YE Qi, RUAN Tong, LIN Yupian, XUE Wandong. Semi-supervised Text Style Transfer Method Based on Multi-reward Reinforcement Learning [J]. Computer Science, 2024, 51(8): 263-271.
[9] WANG Xianwei, FENG Xiang, YU Huiqun. Multi-agent Cooperative Algorithm for Obstacle Clearance Based on Deep Deterministic Policy Gradient and Attention Critic [J]. Computer Science, 2024, 51(7): 319-326.
[10] GAO Yuzhao, NIE Yiming. Survey of Multi-agent Deep Reinforcement Learning Based on Value Function Factorization [J]. Computer Science, 2024, 51(6A): 230300170-9.
[11] ZHONG Yuang, YUAN Weiwei, GUAN Donghai. Weighted Double Q-Learning Algorithm Based on Softmax [J]. Computer Science, 2024, 51(6A): 230600235-5.
[12] LI Danyang, WU Liangji, LIU Hui, JIANG Jingqing. Deep Reinforcement Learning Based Thermal Awareness Energy Consumption Optimization Method for Data Centers [J]. Computer Science, 2024, 51(6A): 230500109-8.
[13] WANG Shuanqi, ZHAO Jianxin, LIU Chi, WU Wei, LIU Zhao. Fuzz Testing Method of Binary Code Based on Deep Reinforcement Learning [J]. Computer Science, 2024, 51(6A): 230800078-7.
[14] HUANG Feihu, LI Peidong, PENG Jian, DONG Shilei, ZHAO Honglei, SONG Weiping, LI Qiang. Multi-agent Based Bidding Strategy Model Considering Wind Power [J]. Computer Science, 2024, 51(6A): 230600179-8.
[15] XIN Yuanxia, HUA Daoyang, ZHANG Li. Multi-agent Reinforcement Learning Algorithm Based on AI Planning [J]. Computer Science, 2024, 51(5): 179-192.