Computer Science ›› 2025, Vol. 52 ›› Issue (2): 279-290. doi: 10.11896/jsjkx.240100133

• Computer Network •

Fully Distributed Event Driven Bipartite Consensus Algorithm Based on Reinforcement Learning

CAI Yuliang1, LYU Chunhui1, HE Qiang2, YU Bo3, CHEN Dongyue4, WANG Youtong1, WANG Qiang1, LIU Yuxuan1, ZHAO Jingjing1

  1 School of Mathematics and Statistics, Liaoning University, Shenyang 110036, China
  2 College of Medicine and Biological Information Engineering, Northeastern University, Shenyang 110016, China
  3 Shenyang Institute of Computing Technology, Chinese Academy of Sciences Co., Ltd., Shenyang 110168, China
  4 College of Information Science and Engineering, Northeastern University, Shenyang 110819, China
  • Received: 2024-01-16 Revised: 2024-09-13 Online: 2025-02-15 Published: 2025-02-17
  • Corresponding author: HE Qiang (heqiangcai@gmail.com)
  • About author: CAI Yuliang (ylcaivv@163.com), born in 1988, Ph.D, associate professor. Her main research interests include network control, multi-agent systems and machine learning.
    HE Qiang, born in 1991, Ph.D, associate professor, is a member of CCF (No. D3158M). His main research interests include social networks and machine learning.
  • Supported by:
    National Key Research and Development Program of China (2021YFB3300900), Youth Science Foundation of National Natural Science Foundation of China (62303202), 75th Batch of General Projects of China Postdoctoral Science Foundation (2024M753407), Natural Science Foundation of Liaoning Province, China (2023-BS-082) and Liaoning Province Social Science Planning Fund Project (23C10140012).

Abstract: This paper studies the bipartite consensus problem of multi-agent systems (MASs) with unknown system model information, using reinforcement learning (RL) and a fully distributed event-driven control strategy. First, a hybrid event-triggered mechanism based on a state threshold and a time threshold is proposed to reduce the communication frequency between agents. Second, an adaptive event-triggered consensus control protocol is designed using locally sampled state information, so that the bipartite consensus errors of all follower agents eventually converge to zero. The validity of the event-triggered mechanism is confirmed by excluding Zeno behavior in finite time. Then, based on the RL method, a model-free algorithm is proposed to obtain the feedback gain matrix, which allows the adaptive event-triggered control strategy to be constructed when the model information is unknown. Unlike existing related work, the RL-based event-triggered adaptive control algorithm relies only on locally sampled state information and is independent of any model information or global network information. In addition, the results are extended to the switching topology case, which is more challenging because the state estimate is updated in two situations: 1) when the interaction graph switches, and 2) when the event-triggering condition is satisfied. Finally, the effectiveness of the adaptive event-triggered control algorithm is verified through examples.

Key words: Reinforcement learning, Event-driven, Fully distributed control, Multi-agent systems, Bipartite consensus
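To make the terms in the abstract concrete, the following is a minimal sketch of event-triggered bipartite consensus, not the algorithm of the paper. It assumes leaderless single-integrator agents on a fixed, undirected, structurally balanced signed graph, and a trigger threshold that is the sum of a state-dependent term and an exponentially decaying time-dependent term. The function name `simulate`, the parameters `sigma`, `c0`, `alpha`, and the example graph are all hypothetical. The adaptive gains, the leader-follower setting, the RL-based gain learning, and the switching topology described in the abstract are not reproduced.

```python
import numpy as np

def simulate(A, x0, dt=0.01, steps=3000, sigma=0.02, c0=0.5, alpha=0.5):
    """Euler simulation of event-triggered bipartite consensus (illustrative only).

    A            : symmetric signed adjacency matrix; a negative entry marks
                   an antagonistic link between two agents
    sigma        : weight of the state-dependent threshold term (assumed form)
    c0, alpha    : time-dependent threshold c0 * exp(-alpha * t) (assumed form)
    """
    A = np.asarray(A, dtype=float)
    # Signed Laplacian: diagonal holds sums of absolute edge weights.
    L = np.diag(np.abs(A).sum(axis=1)) - A
    x = np.asarray(x0, dtype=float).copy()
    x_hat = x.copy()                      # last broadcast state of each agent
    events = np.zeros(len(x), dtype=int)  # broadcast count per agent
    for k in range(steps):
        t = k * dt
        # Local disagreement computed from broadcast states only.
        q = L @ x_hat
        # Hybrid threshold: state-dependent part plus time-dependent part.
        threshold = sigma * np.abs(q) + c0 * np.exp(-alpha * t)
        fire = np.abs(x_hat - x) > threshold
        x_hat[fire] = x[fire]             # triggered agents broadcast
        events += fire
        # The control input uses broadcast states, not continuous measurements.
        x = x - dt * (L @ x_hat)
    return x, events

# Hypothetical example: agents {0, 1} cooperate, agents {2, 3} cooperate,
# and the two groups are linked by antagonistic (negative) edges.
A = [[0, 1, 0, -1],
     [1, 0, -1, 0],
     [0, -1, 0, 1],
     [-1, 0, 1, 0]]
x0 = [1.0, -2.0, 0.5, 3.0]
x, events = simulate(A, x0)
print("final states:", x)
print("broadcasts per agent:", events)
```

In this sketch the two groups settle at values of equal magnitude and opposite sign, and each agent broadcasts only when its own trigger fires rather than at every step. The strictly positive time-dependent term keeps the threshold away from zero, which is a common device for ruling out Zeno behavior; whether the paper uses this exact form is an assumption.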

CLC Number: TP393