Computer Science, 2023, Vol. 50, Issue 8: 202-208. doi: 10.11896/jsjkx.220500270
• Artificial Intelligence •
XIONG Liqin, CAO Lei, CHEN Xiliang, LAI Jun