Computer Science, 2022, Vol. 49, Issue (7): 248-253. doi: 10.11896/jsjkx.210400219

• Computer Networks •

Edge-Cloud Collaborative Resource Allocation Algorithm Based on Deep Reinforcement Learning

YU Bin1, LI Xue-hua1, PAN Chun-yu1, LI Na2

  1 School of Information and Telecommunication Engineering, Beijing Information Science & Technology University, Beijing 100192, China
  2 Baicells Joint Laboratory of Intelligent and IoT, Beijing Information Science and Technology University, Beijing 100094, China
  • Received: 2021-04-21 Revised: 2022-03-08 Online: 2022-07-15 Published: 2022-07-12
  • Corresponding author: LI Xue-hua (lixuehua@bistu.edu.cn)
  • About author: (2965165683@qq.com)
    YU Bin, born in 1996, postgraduate. His main research interests include mobile edge computing and resource allocation.
    LI Xue-hua, born in 1977, Ph.D., professor. Her main research interests include wireless communication technologies, Internet of Things technologies and smart edge computing.
  • Supported by:
    Natural Science Foundation of Beijing with Municipal Education Commission Joint Fund (KZ201911232046) and Natural Science Foundation of Beijing with Haidian Original Innovation Joint Fund (L192022, L182039).

Abstract: Mobile edge computing (MEC) enhances the data processing capability of low-power networks and has become an efficient computing paradigm. This paper considers an edge-cloud collaborative system composed of multiple mobile terminals (MTs) and its resource allocation strategy. To reduce the total delay of the MTs, multiple offloading modes are adopted and a task offloading algorithm based on deep reinforcement learning is proposed. The algorithm uses a deep neural network (DNN) as a scalable solution that learns multi-ary offloading decisions from experience to minimize the total delay. Simulation results indicate that, compared with the deep Q network (DQN) algorithm and the deep deterministic policy gradient (DDPG) algorithm, the proposed algorithm significantly improves the maximum performance gain. In addition, the proposed algorithm converges well, and its result approaches the optimal solution obtained by exhaustive search.

Key words: Deep reinforcement learning, Mobile edge computing, Mobile terminal, Resource allocation
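
The abstract benchmarks the learned policy against the optimum found by exhaustive search over multi-ary offloading decisions. The following is a minimal sketch of such a baseline, not the authors' implementation. It assumes three hypothetical offloading modes (`local`, `edge`, `cloud`) and a toy delay model whose rates and CPU speeds are invented purely for illustration; none of these names or numbers come from the paper.

```python
from itertools import product

# Hypothetical offloading modes; the abstract only says "multiple modes".
MODES = ("local", "edge", "cloud")


def task_delay(mode, bits, cycles, n_edge):
    """Toy per-task delay in seconds (assumed model, not from the paper)."""
    if mode == "local":
        return cycles / 1e9                    # assumed 1 GHz terminal CPU
    uplink = bits / 10e6                       # assumed 10 Mbit/s uplink
    if mode == "edge":
        # Assumed 10 GHz edge CPU shared equally by all tasks sent to it.
        return uplink + cycles * n_edge / 10e9
    # Cloud: assumed 50 ms backhaul latency and a 50 GHz CPU.
    return uplink + 0.05 + cycles / 50e9


def total_delay(decision, tasks):
    """Sum of delays over all terminals for one joint offloading decision."""
    n_edge = decision.count("edge")
    return sum(task_delay(m, b, c, n_edge)
               for m, (b, c) in zip(decision, tasks))


def exhaustive_search(tasks):
    """Enumerate all len(MODES) ** len(tasks) decisions and keep the best."""
    return min(product(MODES, repeat=len(tasks)),
               key=lambda d: total_delay(d, tasks))


# (input bits, required CPU cycles) for three hypothetical terminals.
tasks = [(1e6, 2e9), (4e6, 1e8), (2e6, 5e9)]
best = exhaustive_search(tasks)
print(best, round(total_delay(best, tasks), 3))
```

With M modes and N terminals the search visits M^N joint decisions, so it is only practical for small systems. That exponential growth is the usual motivation for replacing enumeration with a learned policy such as the DNN described in the abstract.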

CLC Number: 

  • TP181