Computer Science ›› 2025, Vol. 52 ›› Issue (9): 330-336.doi: 10.11896/jsjkx.240700107

• Artificial Intelligence •

Graph Attention-based Grouped Multi-agent Reinforcement Learning Method

ZHU Shihao1, PENG Kexing2, MA Tinghuai1,3   

  1 School of Software,Nanjing University of Information Science and Technology,Nanjing 210044,China
    2 School of Computer Science,Nanjing University of Information Science and Technology,Nanjing 210044,China
    3 School of Computer Engineering,Jiangsu Ocean University,Lianyungang,Jiangsu 222005,China
  • Received:2024-07-16 Revised:2024-11-04 Online:2025-09-15 Published:2025-09-11
  • About author:ZHU Shihao,born in 1997,master.His main research interest is reinforcement learning.
    MA Tinghuai,born in 1974,Ph.D,professor,Ph.D supervisor.His main research interests include data mining,social networks,privacy preservation and data sharing.
  • Supported by:
    National Natural Science Foundation of China(62372243,62102187).

Abstract: Currently, multi-agent reinforcement learning is widely applied to a variety of cooperative tasks. In real environments, agents typically have access only to partial observations, which leads to inefficient exploration of cooperative strategies. Moreover, because agents share a common reward value, it is difficult to accurately assess each individual's contribution. To address these issues, a novel graph attention-based grouped multi-agent reinforcement learning framework is proposed, which improves cooperation efficiency and enhances the evaluation of individual contributions. Firstly, a multi-agent system with a graph structure is constructed, which learns the relationships between individual agents and their neighbors so that they can share information. This approach expands each agent's perceptual field, mitigating the constraints of partial observability and supporting the assessment of individual contributions. Secondly, an action reference module is designed to provide joint action reference information for individual action selection, enabling agents to explore more efficiently and diversely. Experimental results in multi-agent control scenarios of two different scales demonstrate significant advantages over baseline methods. Detailed ablation studies further verify the effectiveness of the graph attention grouping approach and the communication settings.
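The neighbor-aggregation idea described in the abstract can be sketched as masked attention over agent observation embeddings: each agent weights its neighbors' features and mixes them into its own representation, widening its perceptual field beyond its partial observation. The function below is a minimal illustrative sketch, not the paper's actual architecture; the single shared projection matrix, the dimensions, and the adjacency convention (1 = neighbor, self-edges included) are all assumptions.

```python
import numpy as np

def graph_attention(obs: np.ndarray, adj: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One round of masked attention over agent embeddings.

    obs: (n_agents, obs_dim) per-agent partial observations
    adj: (n_agents, n_agents) adjacency mask, 1 = neighbor (self-edges included)
    W:   (obs_dim, hidden_dim) shared embedding projection
    Returns (n_agents, hidden_dim) neighbor-aggregated features.
    """
    h = obs @ W                                    # embed each observation
    scores = h @ h.T / np.sqrt(h.shape[-1])        # scaled pairwise similarity
    scores = np.where(adj > 0, scores, -np.inf)    # mask out non-neighbors
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)       # softmax over neighbors
    return attn @ h                                # aggregate neighbor features

rng = np.random.default_rng(0)
obs = rng.standard_normal((4, 8))                  # 4 agents, 8-dim observations
W = rng.standard_normal((8, 16))
adj = np.ones((4, 4))                              # fully connected, incl. self
out = graph_attention(obs, adj, W)
print(out.shape)  # (4, 16)
```

With only self-edges in the adjacency mask, the attention matrix collapses to the identity and each agent keeps its own embedding unchanged; denser masks progressively blend in neighbor information, which is the mechanism the grouping builds on.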

Key words: Multi-agent reinforcement learning, Graph attention network, Centralized training decentralized execution, Multi-agent cooperation, Multi-agent communication

CLC Number: TP391