计算机科学 ›› 2025, Vol. 52 ›› Issue (6A): 240600096-6.doi: 10.11896/jsjkx.240600096

• 人工智能 • 上一篇    下一篇

基于PPO的自适应杂交遗传算法求解旅行商问题

黄傲1, 李敏1, 曾祥光1, 潘云伟1, 张加衡1, 彭倍2   

  1. 1 西南交通大学机械工程学院 成都 610031
    2 电子科技大学机械与电气工程学院 成都 611731
  • 出版日期:2025-06-16 发布日期:2025-06-12
  • 通讯作者: 李敏(min_li@swjtu.edu.cn)
  • 作者简介:(431842867@qq.com)
  • 基金资助:
    四川省科技厅重点研发计划项目(2023YFG0285)

Adaptive Hybrid Genetic Algorithm Based on PPO for Solving Traveling Salesman Problem

HUANG Ao1, LI Min1, ZENG Xiangguang1, PAN Yunwei1, ZHANG Jiaheng1, PENG Bei2   

  1. 1 School of Mechanical Engineering,Southwest Jiaotong University,Chengdu 610031,China
    2 School of Mechanical and Electrical Engineering,University of Electronic Science and Technology of China,Chengdu 611731,China
  • Online:2025-06-16 Published:2025-06-12
  • About author:HUANG Ao,born in 1999,postgraduate,is a member of CCF(No.U4934G).His main research interests include AUVs path planning and so on.
    LI Min,born in 1981,Ph.D,lecturer.His main research interests include machine learning and intelligent control.
  • Supported by:
    Key R&D Program Poject of Sichuan Provincial Department of Science and Technology(2023YFG0285).

摘要: 旅行商问题(Traveling Salesman Problem,TSP)是一个经典的组合优化问题,求解难度较大。传统遗传算法在求解旅行商问题时,参数调节过分依赖经验,同时种群多样性过早减少会导致局部收敛,严重影响算法性能。为此,提出一种自适应杂交遗传算法(Adaptive Hybrid Genetic Algorithm,AHGA),采用深度强化学习对遗传算法的关键参数进行自适应调整。首先,构建了以遗传算法为环境的自适应参数调节模型,采用近端策略优化(Proximal Policy Optimization,PPO)算法来生成控制种群进化的动作策略。其次,在传统遗传算法交叉、变异的基础上增加杂交算子,以提高迭代后期种群的多样性。最后,在不同的TSPLIB公共实例中验证算法的效果和性能。结果表明,该算法明显提高了遗传算法的求解质量和收敛速度,有效避免了遗传算法的局部收敛问题,在解决旅行商问题时优于同类算法。

关键词: 旅行商问题, 遗传算法, 近端策略优化, 杂交算子, 参数自适应

Abstract: Traveling salesman problem(TSP) is a classic combinatorial optimization problem known for its significant computational complexity.Traditional genetic algorithms heavily rely on empirical parameter adjustments when solving the TSP,and premature reduction of population diversity will lead to local convergence,severely impacting algorithm performance.Therefore,this paper proposes an adaptive hybrid genetic algorithm(AHGA) that adjusts the key parameters of genetic algorithms adaptively with reinforcement learning.Firstly,an adaptive parameter adjustment model based on genetic algorithms is constructed.Proximal policy optimization(PPO) algorithm is employed to generate action policies controlling population evolution.Secondly,a hybrid operator is introduced to traditional genetic algorithm crossover and mutation to enhance population diversity in later iterations.Finally,the effectiveness and performance of the algorithm are validated on various TSPLIB public instances.The results demonstrate that the proposed algorithm significantly improves the solution quality and convergence speed of genetic algorithms,effectively avoiding local convergence issues,and outperforms similar algorithms in solving TSP.

Key words: Traveling salesman problem, Genetic algorithm, Proximal policy optimization, Hybrid operator, Parameter adaptation

中图分类号: 

  • TP181
[1]XU W H,WEI C X,ZHANG G R.Genetic Algorithm for Tra-veling Salesman Problem Based on K-Nearest Neighbors Search [J].Journal of Kunming University of Science and Technology(Natural Science),2022,47(1):139-146.
[2]LI C,WANG N,ZHANG C.Hierarchical Mission Planning for Cleaning Photovoltaic Panels Based on Improved Genetic Algorithm[J].Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University,2021,55(9):1169-74.
[3]WANG S,LIU G,GAO M,et al.Application of improved adaptive genetic algorithm in TDOA location[J].Journal of Systems Engineering and Electronics,2019,41(2):254-258.
[4]XU J,HAN F Q,LIU Q X,et al.Bioinformation Heuristic Genetic Algorithm for Solving TSP[J].Journal of System Simulation,2022,34(8):1811-1819.
[5]CHEN R,YANG B,LI S,et al.A self-learning genetic algorithm based on reinforcement learning for flexible job-shop scheduling problem [J].Computers and Industrial Engineering,2020,149:106778.
[6]KOKSAL AHMED E,LI Z,VEERAVALLI B,et al.Reinforcement learning-enabled genetic algorithm for school bus scheduling [J].Journal of Intelligent Transportation Systems:Technology,Planning,and Operations,2022,26(3):269-83.
[7]SONG Y,WEI L,YANG Q,et al.RL-GA:A ReinforcementLearning-based Genetic Algorithm for Electromagnetic Detection Satellite Scheduling Problem [J].Swarm and Evolutionary Computation,2023,77.
[8]SHENMAIER V.Efficient PTAS for the maximum travelingsalesman problem in a metric space of fixed doubling dimension [J].Optimization Letters,2022,16(7):2115-2122.
[9]SCHULMAN J,WOKSKI F,DHARIWAL P,et al.Proximalpolicy optimization algorithms [J].arXiv:1707.06347,2017.
[10]QUEVEDO J,ABDELATTI M,IMANI F,et al.Using rein-forcement learning for tuning genetic algorithms [C]//EGCCO 2021.France:Association for Computing Machinery,2021.
[11]JIAN-TING C,YANG X.Survey of Unstable Gradients in Deep Neural Network Training [J].Journal of Software,2018,29(7):2071-91.
[12]ZENG D,YAN T,ZENG Z,et al.A Hyperparameter Adaptive Genetic Algorithm Based on DQN [J].Journal of Circuits,Systems and Computers,2023,32(4).
[13]DORIGO M,GAMBARDELLA L M.Ant colonies for the tra-velling salesman problem [J].Biosystems,1997,43(2):73-81.
[14]GLOVER F W.Tabu Search - Part I [J].INFORMS J Comput,1989,1:190-206.
[15]ZHANG C,TU L,WANG J.Application of self-adaptive antcolony optimization in TSP [J].Journal of Central South University(Science and Technology),2015,46:2944-2949.
[16]DAI H,KHALIL E B,ZHANG Y,et al.Learning combinatorial optimization algorithms over graphs [C]//Proceedings of the 31st International Conference on Neural Information Processing Systems.2017:6351-6361.
[17]WU Y,SONH W,CAO Z,et al.Learning Improvement Heuristics for Solving Routing Problems [J].IEEE Transactions on Neural Networks and Learning Systems,2022,33(9):5057-5069.
[18]GUNDUZ M,KIRAN M S,OZCEYLAN E.A hierarchic ap-proach based on swarm intelligence to solve the traveling salesman problem [J].Turkish Journal of Electrical Engineering and Computer Sciences,2015,23(1):103-117.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!