Computer Science ›› 2021, Vol. 48 ›› Issue (7): 316-323. doi: 10.11896/jsjkx.200800095

• Computer Networks •

Research Progress of Task Offloading Based on Deep Reinforcement Learning in Mobile Edge Computing

LIANG Jun-bin1,2, ZHANG Hai-han1,2, JIANG Chan3, WANG Tian-shu4

  1 School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
    2 Guangxi Key Laboratory of Multimedia Communication and Network Technology, Nanning 530004, China
    3 XingJian College of Science and Liberal Arts of Guangxi University, Nanning 530004, China
    4 Neusoft Group (Nanning) Co., Ltd, Nanning 530007, China
  • Received: 2020-08-16  Revised: 2020-12-02  Online: 2021-07-15  Published: 2021-07-02
  • Corresponding author: ZHANG Hai-han (1146795832@qq.com)
  • About the authors: LIANG Jun-bin, born in 1979, Ph.D, professor, Ph.D supervisor. His main research interests include wireless sensor networks, network deployment and optimization. (liangjb2002@163.com)
    ZHANG Hai-han, born in 1993, postgraduate. His main research interests include wireless sensor networks and artificial intelligence.
  • Supported by: National Natural Science Foundation of China (61562005), Guangxi Key Research and Development Plan Project (AB19259006) and Natural Science Foundation of Guangxi (2019GXNSFAA185042, 2018GXNSFBA281169).


Abstract: Mobile edge computing (MEC) is a new network computing paradigm that has emerged in recent years. It places server nodes with strong computing and storage capabilities at the network edge, close to mobile devices (e.g., near base stations), so that mobile devices can offload tasks to nearby edge servers for processing. This overcomes a drawback of traditional networks, in which mobile devices, with their weak computing and storage capabilities and limited energy, have to offload tasks to remote cloud platforms at a high cost in time, energy and security. However, deciding how a device that holds only limited local information (such as the number of its neighbors) should, according to the size and number of its tasks, either execute them locally or offload them fully or partially to the MEC server with optimal delay and energy consumption in a dynamic network whose wireless channels vary over time, is a multi-objective programming problem that is hard to solve. Traditional optimization techniques (such as convex optimization) can hardly obtain good results. Deep reinforcement learning is a new artificial intelligence technique that combines deep learning with reinforcement learning. It can make more accurate decisions for complex cooperation, game-theoretic and other problems, and has broad application prospects in industry, agriculture, commerce and many other fields. In recent years, using deep reinforcement learning to optimize task offloading in mobile edge computing networks has become a new research trend. In the past three years, some researchers have conducted preliminary explorations and achieved lower latency and energy consumption than earlier approaches that used deep learning or reinforcement learning alone, but many shortcomings remain. To further advance research in this field, this paper analyzes, compares and summarizes related domestic and foreign work in recent years, discusses their advantages and disadvantages, and outlines promising directions for future research.
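The local-versus-offload tradeoff described in the abstract can be made concrete with a commonly used MEC cost model. The sketch below is an illustration only, not a model taken from any of the surveyed works; every symbol and value (task size D in bits, workload C in CPU cycles, CPU frequencies f_loc and f_edge, uplink rate r, energy coefficient kappa, transmit power p_tx, weights w_delay and w_energy) is an assumption chosen for clarity.

```python
# Illustrative sketch (assumptions, not from the paper): a simple weighted
# delay-energy cost model for one binary offloading decision.

def local_cost(C, f_loc, kappa=1e-27):
    """Delay and energy of executing the task on the mobile device."""
    delay = C / f_loc                  # seconds: cycles / local frequency
    energy = kappa * f_loc ** 2 * C    # common dynamic CPU energy model
    return delay, energy

def offload_cost(D, C, r, f_edge, p_tx):
    """Delay and device energy when the task is sent to an edge server."""
    t_up = D / r                       # uplink transmission time
    delay = t_up + C / f_edge          # transmission + edge execution
    energy = p_tx * t_up               # device only pays for transmission
    return delay, energy

def decide(D, C, f_loc, f_edge, r, p_tx, w_delay=0.5, w_energy=0.5):
    """Pick the option with the lower weighted delay-energy cost."""
    dl, el = local_cost(C, f_loc)
    do, eo = offload_cost(D, C, r, f_edge, p_tx)
    cost_local = w_delay * dl + w_energy * el
    cost_off = w_delay * do + w_energy * eo
    return "offload" if cost_off < cost_local else "local"
```

With a heavy task and a fast uplink (e.g., C = 1e9 cycles, r = 10 Mbit/s), offloading wins; shrink the uplink rate and the same task stays local, which is exactly the channel-dependent decision the abstract describes.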

Key words: Mobile edge computing, Deep reinforcement learning, Task offloading, Offloading decision, Deep learning, Reinforcement learning
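As a minimal illustration of how reinforcement learning drives an offloading decision under a time-varying channel, the sketch below uses tabular Q-learning on a toy two-state model. This is an assumption-laden stand-in, not any algorithm surveyed in the paper: the reviewed works replace the Q-table with a deep neural network (e.g., DQN and its variants) to cope with large state spaces, and the reward values here are invented for illustration.

```python
import random

# Toy offloading MDP (all numbers are illustrative assumptions):
# state  = discretized channel quality (0 = bad, 1 = good)
# action = 0 (execute locally) or 1 (offload to the edge server)

def train(episodes=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]       # q[state][action]
    state = rng.randint(0, 1)
    for _ in range(episodes):
        # epsilon-greedy action selection
        if rng.random() < eps:
            action = rng.randint(0, 1)
        else:
            action = 0 if q[state][0] >= q[state][1] else 1
        # toy reward: offloading pays off only on a good channel;
        # local execution always yields a modest fixed reward
        if action == 1:
            reward = 1.0 if state == 1 else -1.0
        else:
            reward = 0.2
        next_state = rng.randint(0, 1)  # channel varies over time
        q[state][action] += alpha * (
            reward + gamma * max(q[next_state]) - q[state][action])
        state = next_state
    return q

q = train()
# with this reward structure the learned policy should offload only
# when the channel is good
policy = [0 if q[s][0] >= q[s][1] else 1 for s in (0, 1)]
```

The same loop structure (observe state, pick action, receive a delay/energy-based reward, update the value estimate) carries over to the deep variants; only the table is swapped for a network trained on replayed transitions.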

CLC Number: TP393