Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230300170-9.doi: 10.11896/jsjkx.230300170
• Artificial Intelligence •
GAO Yuzhao, NIE Yiming
[1] TAMPUU A, MATIISEN T, KODELJA D, et al. Multiagent cooperation and competition with deep reinforcement learning [J]. PLoS ONE, 2017, 12(4): e0172395.
[2] LOWE R, WU Y, TAMAR A, et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments [C]//Advances in Neural Information Processing Systems 30 (NIPS 2017). 2017.
[3] FOERSTER J, FARQUHAR G, AFOURAS T, et al. Counterfactual Multi-Agent Policy Gradients [C]//The Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, Louisiana, USA: AAAI Press, 2018: 2974-2982.
[4] WONG A, BÄCK T, KONONOVA A V, et al. Deep multiagent reinforcement learning: challenges and directions [J]. Artificial Intelligence Review, 2022, 56: 5023-5056.
[5] HAO J, YANG T, TANG H, et al. Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain [J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 1(1): 1-21.
[6] DU F, DING S F. A survey of multi-agent reinforcement learning [J]. Computer Science, 2019, 46(8): 1-8.
[7] SUN Y, CAO L, CHEN X L, et al. Overview of multi-agent deep reinforcement learning [J]. Computer Engineering and Applications, 2020, 56(5): 13-24.
[8] YAN C, XIANG X J, XU X, et al. A Survey on the Scalability and Transferability of Multi-Agent Deep Reinforcement Learning [J]. Control and Decision, 2023, 37(12): 3083-3102.
[9] XIONG L Q, CAO L, LAI J, et al. Overview of Multi-agent Deep Reinforcement Learning Based on Value Factorization [J]. Computer Science, 2022, 49(9): 172-182.
[10] LI T, ZHU K, LUONG N C, et al. Applications of Multi-Agent Reinforcement Learning in Future Internet: A Comprehensive Survey [J]. IEEE Communications Surveys & Tutorials, 2022, 24(2): 1240-1279.
[11] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Playing Atari with Deep Reinforcement Learning [EB/OL]. arXiv:1312.5602, 2013.
[12] HASSELT H V, GUEZ A, SILVER D. Deep Reinforcement Learning with Double Q-Learning [C]//Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence. Phoenix, Arizona: AAAI Press, 2016: 2094-2100.
[13] SON K, KIM D, KANG W, et al. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning [C]//International Conference on Machine Learning. 2019.
[14] SUNEHAG P, LEVER G, GRUSLYS A, et al. Value-Decomposition Networks for Cooperative Multi-Agent Learning [EB/OL]. arXiv:1706.05296, 2017.
[15] MAHAJAN A, RASHID T, SAMVELYAN M, et al. MAVEN: Multi-Agent Variational Exploration [C]//Advances in Neural Information Processing Systems 32 (NeurIPS 2019). California: Neural Information Processing Systems (NIPS), 2019.
[16] LI B. Hierarchical Architecture for Multi-Agent Reinforcement Learning in Intelligent Game [C]//2022 International Joint Conference on Neural Networks (IJCNN). New York: IEEE, 2022.
[17] WANG W, YANG T, LIU Y, et al. From Few to More: Large-Scale Dynamic Multiagent Curriculum Learning [C]//The Thirty-Fourth AAAI Conference on Artificial Intelligence, the Thirty-Second Innovative Applications of Artificial Intelligence Conference and the Tenth AAAI Symposium on Educational Advances in Artificial Intelligence. New York: Association for the Advancement of Artificial Intelligence, 2020: 7293-7300.
[18] COHEN A, TENG E, BERGES V, et al. On the Use and Misuse of Absorbing States in Multi-agent Reinforcement Learning [EB/OL]. arXiv:2111.05992, 2021.
[19] RASHID T, SAMVELYAN M, DE WITT C, et al. Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [J]. Journal of Machine Learning Research, 2020, 21.
[20] YANG Y, HAO J, LIAO B, et al. Qatten: A General Framework for Cooperative Multiagent Reinforcement Learning [EB/OL]. arXiv:2002.03939, 2020.
[21] WANG J, REN Z, LIU T, et al. QPLEX: Duplex Dueling Multi-Agent Q-Learning [EB/OL]. arXiv:2008.01062, 2020.
[22] WANG Z, SCHAUL T, HESSEL M, et al. Dueling Network Architectures for Deep Reinforcement Learning [C]//International Conference on Machine Learning. 2016.
[23] SIQI S, MENGWEI Q, JUN L. ResQ: A Residual Q Function-based Approach for Multi-Agent Reinforcement Learning Value Factorization [C]//36th Conference on Neural Information Processing Systems. New York: Curran Associates, 2022: 5471-5483.
[24] PINA R, DE SILVA V, HOOK J, et al. Residual Q-Networks for Value Function Factorizing in Multi-Agent Reinforcement Learning [J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(2): 1534-1544.
[25] RASHID T, FARQUHAR G, PENG B, et al. Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [C]//Advances in Neural Information Processing Systems 33 (NeurIPS 2020). New York: Curran Associates, 2020: 10199-10210.
[26] DU W, DING S, GUO L, et al. Value function factorization with dynamic weighting for deep multi-agent reinforcement learning [J]. Information Sciences, 2022, 615: 191-208.
[27] REHMAN H M R U, ON B, NINGOMBAM D D, et al. QSOD: Hybrid Policy Gradient for Deep Multi-agent Reinforcement Learning [J]. IEEE Access, 2021, 9: 129728-129741.
[28] HALL G, HOLLADAY K. Adaptive Average Exploration in Multi-Agent Reinforcement Learning [C]//2020 AIAA/IEEE 39th Digital Avionics Systems Conference (DASC). New York: IEEE, 2020.
[29] JIANG H, SHI D, XUE C, et al. GHGC: Goal-based Hierarchical Group Communication in Multi-Agent Reinforcement Learning [C]//2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC). New York: IEEE, 2020: 3507-3514.
[30] XIONG L, CAO L, CHEN X, et al. A Value Factorization Method for MARL Based on Correlation between Individuals [J]. Mathematical Problems in Engineering, 2022, 2022: 1-8.
[31] BAI Y, GONG C, ZHANG B, et al. Cooperative Multi-Agent Reinforcement Learning with Hypergraph Convolution [C]//2022 International Joint Conference on Neural Networks (IJCNN). New York: IEEE, 2022.
[32] YUN W J, YI S, KIM J. Multi-Agent Deep Reinforcement Learning using Attentive Graph Neural Architectures for Real-Time Strategy Games [C]//2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC). New York: IEEE, 2021: 2967-2972.
[33] SUN W, LEE C, LEE C. DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning [C]//International Conference on Machine Learning. 2021.
[34] XU Z, LI D, BAI Y, et al. MMD-MIX: Value Function Factorisation with Maximum Mean Discrepancy for Cooperative Multi-Agent Reinforcement Learning [C]//2021 International Joint Conference on Neural Networks (IJCNN). New York: IEEE, 2021.
[35] HUANG L, FU M, RAO A, et al. A Distributional Perspective on Multiagent Cooperation With Deep Reinforcement Learning [J]. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(3): 4246-4259.
[36] YANG G, CHEN H, ZHANG J, et al. Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning [C]//2022 International Joint Conference on Neural Networks (IJCNN). New York: IEEE, 2022: 1-8.
[37] LIU X, LI X, LI Y, et al. PS-QMix: A Parallel Learning Framework for Q-Mix Using Parameter Server [C]//Advanced Data Mining and Applications (ADMA 2021). 2022: 341-352.
[38] WAN K, XU X, LI Y. Learning Distinct Strategies for Heterogeneous Cooperative Multi-agent Reinforcement Learning [C]//Artificial Neural Networks and Machine Learning (ICANN 2021). Switzerland: Springer International Publishing, 2021: 544-555.
[39] LIQIN X, LEI C, XILIANG C, et al. Character-Based Value Factorization for MADRL [J]. The Computer Journal, 2023, 66(11): 2782-2793.
[40] WU H, ZHANG J, WANG Z, et al. Sub-AVG: Overestimation reduction for cooperative multi-agent reinforcement learning [J]. Neurocomputing, 2022, 474: 94-106.
[41] CHAI J, LI W, ZHU Y, et al. UNMAS: Multiagent Reinforcement Learning for Unshaped Cooperative Scenarios [J]. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(4): 2093-2104.
[42] NADERIALIZADEH N, HUNG F H, SOLEYMAN S, et al. Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning [EB/OL]. arXiv:2010.04740, 2020.
[43] ZHOU T, ZHANG F, SHAO K, et al. Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment [EB/OL]. arXiv:2106.00517, 2021.
[44] CHEN H, YANG G, ZHANG J, et al. RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [C]//2022 International Joint Conference on Neural Networks (IJCNN). New York: IEEE, 2022.
[45] ZHANG T, XU H, WANG X, et al. Multi-Agent Collaboration via Reward Attribution Decomposition [EB/OL]. arXiv:2010.08531, 2020.
[46] HU S, ZHU F, CHANG X, et al. UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers [EB/OL]. arXiv:2101.08001, 2021.
[47] IQBAL S, DE WITT C, PENG B, et al. Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [C]//International Conference on Machine Learning. San Diego: JMLR, 2021.
[48] LEI C, ZHAO H, ZHOU L, et al. Intelligent Dynamic Spectrum Allocation in MEC-Enabled Cognitive Networks: A Multiagent Reinforcement Learning Approach [J]. Wireless Communications and Mobile Computing, 2022, 2022: 1-13.
[49] GUO Z, CHEN Z, LIU P, et al. Multi-Agent Reinforcement Learning-Based Distributed Channel Access for Next Generation Wireless Networks [J]. IEEE Journal on Selected Areas in Communications, 2022, 40(5): 1587-1599.
[50] WANG Z, ZONG J, ZHOU Y, et al. Decentralized Multi-Agent Power Control in Wireless Networks With Frequency Reuse [J]. IEEE Transactions on Communications, 2022, 70(3): 1666-1681.
[51] HAN C, YAO H, MAI T, et al. QMIX Aided Routing in Social-Based Delay-Tolerant Networks [J]. IEEE Transactions on Vehicular Technology, 2022, 71(2): 1952-1963.
[52] MSEDDI A, JAAFAR W, MOUSSAID A, et al. Collaborative D2D Pairing in Cache-Enabled Underlay Cellular Networks [C]//2021 IEEE Global Communications Conference (GLOBECOM). New York: IEEE, 2021: 1-6.
[53] YU Z, NING N W, ZHENG Y L, et al. Survey of Intelligent Traffic Signal Control Strategies Driven by Deep Reinforcement Learning [J]. Computer Science, 2023, 50(4): 159-171.
[54] WANG Z, ZHU H, HE M, et al. GAN and Multi-Agent DRL Based Decentralized Traffic Light Signal Control [J]. IEEE Transactions on Vehicular Technology, 2022, 71(2): 1333-1348.
[55] ZHANG Z, QIAN J, FANG C, et al. Coordinated Control of Distributed Traffic Signal Based on Multiagent Cooperative Game [J]. Wireless Communications and Mobile Computing, 2021, 2021: 1-13.
[56] CHEN X, XIONG G, LV Y, et al. A Collaborative Communication-Qmix Approach for Large-scale Networked Traffic Signal Control [C]//2021 IEEE International Intelligent Transportation Systems Conference (ITSC). New York: IEEE, 2021: 3450-3455.
[57] ZHANG S, ZHUAN X. Distributed Model Predictive Control for Two-Dimensional Electric Vehicle Platoon Based on QMIX Algorithm [J]. Symmetry, 2022, 14(10): 2069.
[58] YUAN Z, WU T, WANG Q, et al. T3OMVP: A Transformer-Based Time and Team Reinforcement Learning Scheme for Observation-Constrained Multi-Vehicle Pursuit in Urban Area [J]. Electronics, 2022, 11(9): 1339.
[59] ZHOU T, KRIS M L, CREIGHTON D, et al. GMIX: Graph-based spatial-temporal multi-agent reinforcement learning for dynamic electric vehicle dispatching system [J]. Transportation Research Part C: Emerging Technologies, 2022, 144: 103886.
[60] YIN Y, GUO Y, SU Q, et al. Task Allocation of Multiple Unmanned Aerial Vehicles Based on Deep Transfer Reinforcement Learning [J]. Drones, 2022, 6(8): 215.
[61] WANG J, ZHANG X, HE X, et al. Bandwidth Allocation and Trajectory Control in UAV-Assisted IoV Edge Computing Using Multiagent Reinforcement Learning [J]. IEEE Transactions on Reliability, 2023, 72(2): 599-608.
[62] DING R, CHEN J, WU W, et al. Packet Routing in Dynamic Multi-Hop UAV Relay Network: A Multi-Agent Learning Approach [J]. IEEE Transactions on Vehicular Technology, 2022, 71(9): 10059-10072.
[63] RITZ F, PHAN T, MÜLLER R, et al. SAT-MARL: Specification Aware Training in Multi-Agent Reinforcement Learning [C]//Proceedings of the 13th International Conference on Agents and Artificial Intelligence. SCITEPRESS, 2021: 28-37.
[64] CHOI H, KIM J, HAN Y, et al. MARL-Based Cooperative Multi-AGV Control in Warehouse Systems [J]. IEEE Access, 2022, 10: 100478-100488.
[65] HANHAN Z, TIAN L, VANEET A. PAC: Assisted Value Factorisation with Counterfactual Predictions in Multi-Agent Reinforcement Learning [C]//36th Conference on Neural Information Processing Systems. New York: Curran Associates, 2022: 15757-15769.
[66] WANG Y, HAN B, WANG T, et al. Off-Policy Multi-Agent Decomposed Policy Gradients [EB/OL]. arXiv:2007.12322, 2020.
[67] WANG T, WANG J, ZHENG C, et al. Learning Nearly Decomposable Value Functions via Communication Minimization [EB/OL]. arXiv:1910.05366, 2019.