Computer Science ›› 2021, Vol. 48 ›› Issue (7): 316-323. doi: 10.11896/jsjkx.200800095
梁俊斌1,2, 张海涵1,2, 蒋婵3, 王天舒4
LIANG Jun-bin1,2, ZHANG Hai-han1,2, JIANG Chan3, WANG Tian-shu4
Abstract: Mobile edge computing (MEC) is a network computing paradigm that has emerged in recent years. It places server nodes with relatively strong computing and storage capabilities at the network edge, close to mobile devices (e.g., near base stations), so that mobile devices can offload tasks to nearby edge servers for processing. This overcomes a drawback of traditional networks: because mobile devices have weak computing and storage capabilities and limited energy, they are forced to offload tasks to distant cloud platforms, which costs considerable time and energy and poses security risks. However, a difficult multi-objective programming problem remains: a device that holds only limited local information (e.g., the number of its neighbors) must decide, according to the size and number of its tasks, whether to execute them locally or to offload them, fully or partially, to the MEC server that is optimal in both delay and energy consumption, in a dynamic network whose wireless channels vary over time. Traditional optimization techniques (e.g., convex optimization) rarely obtain good results for this problem. Deep reinforcement learning (DRL), a new artificial-intelligence technique that combines deep learning with reinforcement learning, can make more accurate decisions for complex cooperation and game problems and has broad application prospects in industry, agriculture, commerce, and other fields. In recent years, using DRL to optimize task offloading in MEC networks has become a new research trend. Over the past three years, some researchers have made preliminary explorations and achieved lower delay and energy consumption than earlier approaches that used deep learning or reinforcement learning alone, but many shortcomings remain. To further advance research in this field, this paper analyzes, compares, and summarizes recent work at home and abroad in detail, identifies its advantages and disadvantages, and discusses directions for future in-depth research.
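To make the offloading decision described above more concrete, the following is a minimal, self-contained sketch (Python/PyTorch) of a DQN-style agent that observes only local information (task size, required CPU cycles, current channel gain) and learns whether to execute a task locally or offload it to an edge server under a weighted delay-energy objective. The environment dynamics, cost weights, class and variable names, and network sizes are illustrative assumptions for exposition, not the method of any specific paper surveyed here.

```python
# Minimal sketch (assumed toy model, not any surveyed paper's method):
# a DQN agent choosing between local execution and offloading to an MEC server.
import random
import numpy as np
import torch
import torch.nn as nn

class OffloadEnv:
    """Toy single-device MEC environment: state = (task size, CPU cycles needed, channel gain)."""
    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)

    def reset(self):
        return self._new_state()

    def _new_state(self):
        size = self.rng.uniform(0.1, 1.0)            # task size (Mbit), illustrative range
        cycles = size * self.rng.uniform(0.5, 2.0)   # CPU cycles needed (Gcycles)
        gain = self.rng.uniform(0.1, 1.0)            # time-varying wireless channel gain
        return np.array([size, cycles, gain], dtype=np.float32)

    def step(self, state, action):
        size, cycles, gain = state
        if action == 0:   # local execution: slower CPU, higher energy per cycle
            delay, energy = cycles / 0.5, cycles * 0.9
        else:             # offload: transmission cost depends on channel, faster edge CPU
            tx = size / (2.0 * gain)
            delay, energy = tx + cycles / 4.0, tx * 0.3
        reward = -(0.5 * delay + 0.5 * energy)       # weighted delay-energy objective
        return self._new_state(), reward

# Small Q-network mapping the 3-dimensional local state to Q-values of the 2 actions.
q_net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
env, buffer, gamma, eps = OffloadEnv(), [], 0.95, 0.2

state = env.reset()
for step in range(5000):
    # epsilon-greedy action selection: 0 = local execution, 1 = offload
    if random.random() < eps:
        action = random.randrange(2)
    else:
        with torch.no_grad():
            action = int(q_net(torch.tensor(state)).argmax())
    next_state, reward = env.step(state, action)
    buffer.append((state, action, reward, next_state))
    state = next_state
    if len(buffer) >= 64:                            # experience replay: sample a minibatch
        s, a, r, s2 = map(np.array, zip(*random.sample(buffer, 64)))
        q = q_net(torch.tensor(s)).gather(1, torch.tensor(a).long().unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = torch.tensor(r, dtype=torch.float32) + gamma * q_net(torch.tensor(s2)).max(1).values
        loss = nn.functional.mse_loss(q, target)
        opt.zero_grad(); loss.backward(); opt.step()
```

The surveyed works extend this basic idea in several directions, e.g. actor-critic and policy-gradient methods for continuous or partial offloading, prioritized experience replay, and multi-user or multi-server action spaces.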