Computer Science (计算机科学), 2021, Vol. 48, Issue (12): 324-330. doi: 10.11896/jsjkx.201100159

• Artificial Intelligence •

多域SFC部署中基于强化学习的多目标优化方法

王珂1, 曲桦1,2, 赵季红2,3   

  1 西安交通大学软件学院 西安710049
  2 西安交通大学电子与信息工程学院 西安710049
  3 西安邮电大学通信与信息工程学院 西安710061
  • 收稿日期:2020-11-23 修回日期:2021-04-30 出版日期:2021-12-15 发布日期:2021-11-26
  • Corresponding author: QU Hua (qh@mail.xjtu.edu.cn)
  • About author: 215545230@qq.com
  • 基金资助:
    国家重点研发计划项目(2018YFB1800305)

Multi-objective Optimization Method Based on Reinforcement Learning in Multi-domain SFC Deployment

WANG Ke1, QU Hua1,2, ZHAO Ji-hong2,3   

  1 School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  2 School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China
  3 School of Communication and Information Engineering, Xi'an University of Posts & Telecommunications, Xi'an 710061, China
  • Received: 2020-11-23 Revised: 2021-04-30 Online: 2021-12-15 Published: 2021-11-26
  • About author: WANG Ke, born in 1992, Ph.D. His main research interests include software-defined networking, network function virtualization, and service function chain technology.
    QU Hua, born in 1982, Ph.D., professor, is a member of the China Computer Federation. His main research interests include mobile internet, network protocol design, control strategies for supporting emerging applications in ubiquitous networks, and radio resource management in 5G radio communication systems.
  • Supported by:
    National Key R&D Program of China (2018YFB1800305).

摘要: 随着网络虚拟化技术的发展,多域网络中的服务功能链部署为服务功能链优化部署问题带来了新的挑战。传统的部署方法通常对单一目标进行优化,不适用于多目标优化问题,且无法对优化目标间权重进行衡量及平衡。因此,为了对大规模服务功能链部署请求下的时延、网络负载均衡性及接受率进行同步优化,提出了一种数据归一化处理方案,并设计了基于强化学习的两步SFC部署算法。该算法以传输时延与负载均衡性为反馈参数,平衡了两者的权重关系,并对其进行了同步优化,同时利用强化学习框架优化了SFC接受率。实验结果表明,所提算法在大规模请求数下,相比时延感知方法时延降低了71.8%,相比多域部署方法接受率提高了4.6%,相比贪心算法平均负载均衡性提高了39.1%,保证了多目标优化效果。

关键词: 多目标优化, 多域, 服务功能链, 强化学习, 数据归一化

Abstract: With the development of network virtualization technology, deploying service function chains (SFCs) across multi-domain networks poses new challenges for SFC deployment optimization. Traditional deployment methods usually optimize a single objective, so they are ill-suited to multi-objective problems and can neither measure nor balance the weights among the optimization objectives. To jointly optimize delay, network load balancing, and acceptance rate under large-scale SFC deployment requests, a data normalization scheme is proposed and a two-step SFC deployment algorithm based on reinforcement learning is designed. The algorithm takes transmission delay and load balancing as feedback parameters, balances the weights between them, and optimizes both simultaneously, while the reinforcement learning framework is used to improve the SFC acceptance rate. Experimental results show that, under large-scale requests, the proposed algorithm reduces delay by 71.8% compared with the delay-aware LASP method, increases the acceptance rate by 4.6% compared with the multi-domain MDSP method, and improves average load balancing by 39.1% compared with the GREEDY method, which demonstrates its multi-objective optimization effect.

Key words: Data normalization, Multi-domain, Multi-objective optimization, Reinforcement learning, Service function chain
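
The abstract describes normalizing delay and load-balancing measurements so that both can be weighed in a single feedback signal. The sketch below illustrates that general idea only. It assumes min-max normalization and a linear weighted sum, neither of which is confirmed by the abstract, and every name in it (`min_max_normalize`, `combined_reward`, `w_delay`) is hypothetical rather than taken from the paper.

```python
from typing import List, Sequence


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Scale values to [0, 1]; assumed scheme, not the paper's confirmed one."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates are identical on this metric, so it carries no signal.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_reward(norm_delay: float, norm_imbalance: float,
                    w_delay: float = 0.5) -> float:
    """Hypothetical reward: a lower weighted cost gives a higher reward."""
    cost = w_delay * norm_delay + (1.0 - w_delay) * norm_imbalance
    return 1.0 - cost


def best_candidate(delays: Sequence[float], imbalances: Sequence[float],
                   w_delay: float = 0.5) -> int:
    """Return the index of the candidate deployment with the highest reward."""
    nd = min_max_normalize(delays)
    ni = min_max_normalize(imbalances)
    rewards = [combined_reward(d, i, w_delay) for d, i in zip(nd, ni)]
    return max(range(len(rewards)), key=rewards.__getitem__)


if __name__ == "__main__":
    # Illustrative numbers only: three candidate deployments of one SFC.
    delays_ms = [10.0, 30.0, 50.0]   # transmission delay per candidate
    imbalance = [0.4, 0.2, 0.1]      # load-imbalance measure per candidate
    print(best_candidate(delays_ms, imbalance))
```

Without normalization, the delay values in milliseconds would dominate the much smaller imbalance values regardless of the chosen weight. After scaling both metrics to the same range, `w_delay` controls the trade-off directly.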

CLC Number: TP393