Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240900018-9.doi: 10.11896/jsjkx.240900018

• Network & Communication •

Online Parallel SDN Routing Optimization Algorithm Based on Deep Reinforcement Learning

WU Zongming1, CAO Jijun2, TANG Qiang1   

  1 School of Computer Science and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China
    2 School of Computer Science, National University of Defense Technology, Changsha 410073, China
  • Online:2025-06-16 Published:2025-06-12
  • About author: WU Zongming, born in 1995, postgraduate. His main research interests include new-generation information communication networks and software-defined networking.
    CAO Jijun, born in 1979. His main research interests include high-performance interconnect networks, intelligent network management, and enhanced software-defined networking for HPC.
  • Supported by:
    Scientific Research Foundation of the Education Department of Hunan Province (23A0258), Natural Science Foundation of Hunan Province (2021JJ30736, 2023JJ50331), Natural Science Foundation of Changsha (kq2014112) and National Natural Science Foundation of China (62272063).

Abstract: The routing behavior of traditional SDN traffic engineering models based on deep reinforcement learning (DRL) is often unpredictable, and a DRL-based routing scheme that simply applies a DRL algorithm to a communication network system is unreliable. This paper proposes an online parallel SDN routing optimization algorithm based on DRL, so that the trial-and-error DRL routing algorithm can be used reliably to improve network performance. The algorithm combines online parallel routing decision-making with offline training within the SDN framework to solve the SDN routing optimization problem. This approach mitigates the reliability issues caused by an unconverged DRL model and by its exploration process. To a certain extent, it also alleviates the negative impact of the DRL routing model's lack of interpretability and of unreliable routing behavior under network emergencies. The performance of the online parallel SDN routing optimization algorithm is evaluated through extensive experiments on a real network topology. The experimental results show that the proposed algorithm achieves better network performance than both the traditional DRL-based routing algorithm and the OSPF algorithm.
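The online/offline split described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the paper's implementation: the toy topology, the candidate-path scoring table, and the `confident` flag standing in for a convergence/reliability check are all hypothetical. The idea shown is only the safety pattern the abstract implies: an online component picks among DRL-scored candidate paths, but falls back to a shortest-path (OSPF-like) baseline whenever the offline-trained policy cannot yet be trusted.

```python
import heapq

# Hypothetical toy topology: adjacency map with link weights.
TOPOLOGY = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}

def ospf_path(src, dst):
    """Dijkstra shortest path -- stands in for the OSPF baseline route."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    seen = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in seen:
            continue
        seen.add(u)
        if u == dst:
            break
        for v, w in TOPOLOGY[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(pq, (nd, v))
    # Walk predecessors back from dst to src.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return list(reversed(path))

def choose_route(candidates, scores, baseline, confident):
    """Online decision step: use the best DRL-scored candidate path only
    when the (offline-trained) policy is trusted; otherwise fall back to
    the baseline route so exploration cannot degrade live traffic."""
    if not confident:
        return baseline
    return max(candidates, key=lambda p: scores[tuple(p)])
```

In an actual deployment the candidate scores would come from the offline-trained DRL policy evaluated in parallel for each flow, and `confident` would be derived from a convergence or sanity check on that policy; here both are stubbed out to keep the fallback logic visible.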

Key words: Software-defined network, Deep reinforcement learning, Routing optimization

CLC Number: TP393.0
[1]KREUTZ D,RAMOS F M,VERISSIMO P E,et al.Software-defined networking:A comprehensive survey[J].Proceedings of the IEEE,2014,103(1):14-76.
[2]LIU Y C,ZHANG J N.Service Function Chain Embedding Meets Machine Learning:Deep Reinforcement Learning Approach[J].IEEE Transactions on Network and Service Management,2024,21(3):3465-3481.
[3]WANG H N,LIU N,ZHANG Y Y,et al.Deep Reinforcement Learning:A Survey[J].Frontiers of Information Technology & Electronic Engineering,2020,21(12):1726-1744.
[4]AMIN R,ROJAS E,AQDUS A,et al.A survey on Machine Learning Techniques for Routing Optimization in SDN[J].IEEE Access,2021:104582-104611.
[5]LAROCHE R,TRICHELAIR P.Safe Policy Improvement with Baseline Bootstrapping[J].arXiv:1712.06924,2017.
[6]YAO H P,MAI T L,XU X B,et al.NetworkAI:An Intelligent Network Architecture for Self-Learning Control Strategies in Software Defined Networks[J].IEEE Internet of Things Journal,2018,5(6):4319-4327.
[7]JALIL S Q,REHMANI M H,CHALUP S.DQR:Deep Q-Routing in Software Defined Networks[C]//International Joint Conference on Neural Networks.Glasgow:IEEE,2020:1-8.
[8]YU C,LAN J,GUO Z,et al.DROM:Optimizing the Routing in Software-Defined Networks With Deep Reinforcement Learning[J].IEEE Access,2018,6:64533-64539.
[9]SUN P,HU Y,LAN J,et al.TIDE:Time-relevant deep reinforcement learning for routing optimization[J].Future Generation Computer Systems,2019,99:401-409.
[10]XU Z,TANG J,MENG J,et al.Experience-driven Networking:A deep reinforcement learning based approach[C]//IEEE INFOCOM 2018-IEEE Conference on Computer Communications.Honolulu:IEEE,2018:1871-1879.
[11]CHEN Y R,REZAPOUR A,TZENG W G,et al.RL-Routing:An SDN Routing Algorithm Based on Deep Reinforcement Learning[J].IEEE Transactions on Network Science and Engineering,2020,7(4):3185-3199.
[12]SUN P,GUO Z,LAN J,et al.ScaleDRL:A Scalable Deep Reinforcement Learning Approach for Traffic Engineering in SDN with Pinning Control[J].Computer Networks,2021,190:107891.
[13]WANG X F,CHEN G.Pinning control of scale-free dynamical networks[J].Physica A:Statistical Mechanics and Its Applications,2002,310(3/4):521-531.
[14]SUN P H,GUO Z H,LI J F,et al.Enabling Scalable Routing in Software-Defined Networks With Deep Reinforcement Learning on Critical Nodes[J].IEEE/ACM Transactions on Networking,2022,30(2):629-640.
[15]ZHOU W,JIANG X,LUO Q S,et al.AQROM:A quality of service aware routing optimization mechanism based on asynchronous advantage actor-critic in software-defined networks[J].Digital Communications and Networks,2024,10(5):1405-1414.
[16]HE Q,WANG Y,WANG X W,et al.Routing Optimization With Deep Reinforcement Learning in Knowledge Defined Networking[J].IEEE Transactions on Mobile Computing,2024,23(2):1444-1455.
[17]SHEN R.Valiant Load-Balancing: Building Networks That Can Support All Traffic Matrices[M]//Algorithms for Next Generation Networks.Computer Communications and Networks.London:Springer,2010.
[18]AlSHALABI L,SHAABAN Z.Normalization as a Preprocessing Engine for Data Mining and the Approach of Preference Matrix[C]//2006 International Conference on Dependability of Computer Systems.Szklarska:IEEE,2006.
[19]HAARNOJA T,ZHOU A,ABBEEL P,et al.Soft Actor-Critic:Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor[J].arXiv:1801.01290,2018.
[20]WINSTEIN K,BALAKRISHNAN H.TCP ex machina:computer-generated congestion control[J].ACM SIGCOMM Computer Communication Review,2013,43(4):123-134.
[21]VARGA A,HORNIG R.An overview of the OMNeT++ simulation environment[C]//Proceedings of the 1st International Conference on Simulation Tools and Techniques for Communications.Brussels:ISCT,2008:1-10.
[22]KNIGHT S,NGUYEN H X,FALKNER N,et al.The Internet topology zoo[J].IEEE Journal on Selected Areas in Communications,2011,29(9):1765-1775.
[23]ROUGHAN M.Simplifying the synthesis of internet traffic matrices[J].Computer Communication Review,2005,35(5):93-96.
[24]TUNE P,ROUGHAN M.Spatiotemporal Traffic Matrix Synthesis[J].ACM SIGCOMM Computer Communication Review,2015,45(5):579-592.
[25]KHAN A A,ZAFRULLAH M,HUSSAIN M,et al.Performance analysis of OSPF and hybrid networks[C]//International Symposium on Wireless Systems & Networks.Lahore:IEEE,2017:1-4.
[26]ALMASAN P,SUÁREZ-VARELA J,RUSEK K,et al.Deep reinforcement learning meets graph neural networks:Exploring a routing optimization use case[J].Computer Communications,2022,196:184-194.