Computer Science, 2023, Vol. 50, Issue (6): 266-273. doi: 10.11896/jsjkx.230300044

• Artificial Intelligence •

Self Reconfiguration Algorithm of Modular Robot Based on Swarm Agent Deep Reinforcement Learning

WANG Hanmo, ZHENG Shijie, XU Ruonan, GUO Bin, WU Lei   

  1. School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
  • Received: 2023-03-04 Revised: 2023-04-13 Online: 2023-06-15 Published: 2023-06-06
  • About author: WANG Hanmo, born in 2001, undergraduate, is a member of China Computer Federation. His main research interest is modular robots. GUO Bin, born in 1980, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include ubiquitous computing and crowd intelligence with the deep fusion of humans, machines and things.
  • Supported by:
    National Science Fund for Distinguished Young Scholars (62025205) and National Natural Science Foundation of China (62032020, 62102317).

Abstract: Modular robots are composed of a number of standard modules with independent functions. Self-reconfiguration is currently an active and difficult problem in modular robot research. For complex instances, traditional graph-theoretic and search algorithms cannot find an optimal solution in polynomial time, and their complexity grows exponentially with the number of modules. Taking the perspective of swarm-agent deep reinforcement learning, this work treats each homogeneous module as a single agent with learning and perception abilities, and proposes a modular robot self-reconfiguration algorithm based on QMIX. A new reward function is designed for the algorithm, and parallel movement of the agents is realized by restricting their action space. This addresses coordination and cooperation among multiple agents to a certain extent, and thereby achieves the transition from the initial configuration to the target configuration. In experiments with 9 modules, the algorithm is compared with a traditional A*-based search algorithm in terms of success rate and average number of steps. The results show that, when the time-step limit is reasonable, the QMIX-based self-reconfiguration algorithm reaches a success rate above 95%, and both algorithms need about 12 steps on average. The QMIX-based algorithm thus approaches the performance of the traditional algorithm.
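
The two ideas the abstract names, a restricted action space per agent and QMIX-style value mixing, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`masked_greedy`, `make_mixer`, `mix`), the random single-layer "hypernetworks", and all dimensions are assumptions chosen for illustration, and the final bias is simplified to one linear layer. The sketch shows only the defining property of QMIX: the mixing weights are forced to be non-negative, so the joint value Q_tot never decreases when any single agent's Q-value increases.

```python
import math
import random


def elu(x):
    """ELU activation, monotonically increasing."""
    return x if x > 0 else math.exp(x) - 1.0


def linear(state, weights, bias):
    """Hypothetical one-layer hypernetwork: weights @ state + bias."""
    return [sum(w * s for w, s in zip(row, state)) + b
            for row, b in zip(weights, bias)]


def masked_greedy(q_values, allowed):
    """Pick the highest-valued action among those the mask allows."""
    candidates = [a for a, ok in enumerate(allowed) if ok]
    return max(candidates, key=lambda a: q_values[a])


def random_layer(out_dim, in_dim, rng):
    """Randomly initialised (weights, bias) pair; untrained, for illustration."""
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [rng.uniform(-1, 1) for _ in range(out_dim)]
    return weights, bias


def make_mixer(n_agents, state_dim, hidden, seed=0):
    rng = random.Random(seed)
    return {
        "hidden": hidden,
        "w1": random_layer(n_agents * hidden, state_dim, rng),
        "b1": random_layer(hidden, state_dim, rng),
        "w2": random_layer(hidden, state_dim, rng),
        "b2": random_layer(1, state_dim, rng),
    }


def mix(agent_qs, state, mixer):
    """Combine per-agent Q-values into Q_tot with non-negative mixing weights."""
    n, h = len(agent_qs), mixer["hidden"]
    # abs() keeps the state-dependent mixing weights non-negative.
    w1 = [abs(v) for v in linear(state, *mixer["w1"])]
    b1 = linear(state, *mixer["b1"])
    hidden = [elu(sum(agent_qs[i] * w1[i * h + j] for i in range(n)) + b1[j])
              for j in range(h)]
    w2 = [abs(v) for v in linear(state, *mixer["w2"])]
    b2 = linear(state, *mixer["b2"])[0]
    return sum(hv * wv for hv, wv in zip(hidden, w2)) + b2


# Usage: three agents, a 4-dimensional global state (all values are made up).
mixer = make_mixer(n_agents=3, state_dim=4, hidden=2)
state = [0.1, -0.3, 0.5, 0.2]
low = mix([0.0, 1.0, 2.0], state, mixer)
high = mix([1.0, 1.0, 2.0], state, mixer)   # only agent 0's Q-value is raised
choice = masked_greedy([0.9, 0.5, 0.7], [False, True, True])
```

Because Q_tot is monotone in each agent's Q-value, each agent can pick its own best allowed action independently and the joint action still maximizes Q_tot, which is what makes decentralized, parallel moves compatible with centralized training.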

Key words: Modular robot, Self reconfiguration, Swarm agent collaboration, Deep reinforcement learning, Configuration space and action space

CLC Number: 

  • TP242.6