Computer Science, 2021, Vol. 48, Issue (10): 334-342. doi: 10.11896/jsjkx.210300304

• Interdiscipline & Frontier •

Failure-resilient DAG Task Rescheduling in Edge Computing

CAI Ling-feng1, WEI Xiang-lin2, XING Chang-you1, ZOU Xia1, ZHANG Guo-min1

  1 The Command & Control Engineering College, Army Engineering University, Nanjing 210007, China
  2 The 63rd Research Institute, National University of Defense Technology, Nanjing 210007, China
  • Received: 2021-03-31 Revised: 2021-04-28 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: XING Chang-you (changyouxing@126.com)
  • About author: sawakita1122@foxmail.com
    CAI Ling-feng, born in 1995, postgraduate. His main research interests include edge computing, cyberspace security and machine learning.
    XING Chang-you, born in 1982, Ph.D., associate professor. His main research interests include software-defined networking and network measurement.

Abstract: By deploying computation and storage resources at the network edge, close to the data source, and efficiently scheduling the tasks that users offload, edge computing can greatly improve users' quality of experience (QoE). However, edge computing lacks reliable infrastructure protection, so a sudden failure of a server node or communication link can cause a service to fail. To address this problem, we establish failure models for the computing nodes and communication links in edge computing and, for the scheduling of dependent user tasks, propose DaGTR (Dependency-aware Greedy Task Rescheduling), a task rescheduling algorithm for resource failure scenarios. DaGTR comprises two sub-algorithms, DaGTR-N and DaGTR-L, which handle node failure events and link failure events, respectively. DaGTR is aware of the data dependencies among tasks and uses a greedy method to reschedule all user tasks affected by a failure, so that each task can still execute successfully. Simulation results show that the proposed algorithm effectively avoids task failures caused by node or link failures and improves the task success rate under resource failures.

Key words: Directed acyclic graph, Edge computing, Resource failure, Task scheduling

CLC Number: TP398.08
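
The abstract describes dependency-aware greedy rescheduling only at a high level, so the following Python sketch is an illustration of the general idea for a node failure, not the paper's DaGTR-N algorithm. Everything in it is an assumption made for the example: the function names, the cost model (execution time = work / node speed, plus a fixed `transfer` delay when a predecessor ran on a different node), and the simplification that the failure is known before any task has started. Tasks are visited in topological order so that every predecessor is placed first, and each task that was assigned to the failed node is moved to the surviving node that gives it the earliest finish time.

```python
from collections import deque


def topological_order(deps):
    """Kahn's algorithm. `deps` maps each task to a list of its predecessors."""
    indegree = {task: len(preds) for task, preds in deps.items()}
    children = {task: [] for task in deps}
    for task, preds in deps.items():
        for pred in preds:
            children[pred].append(task)
    queue = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while queue:
        task = queue.popleft()
        order.append(task)
        for child in children[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(deps):
        raise ValueError("dependency graph contains a cycle")
    return order


def reschedule_after_node_failure(deps, work, speed, placement, failed,
                                  transfer=1.0):
    """Hypothetical greedy rescheduling; returns (new_placement, finish_times).

    Assumed model (not from the paper): a task on node n takes work/speed[n],
    and data from a predecessor on another node arrives `transfer` later.
    """
    alive = sorted(node for node in speed if node != failed)
    if not alive:
        raise ValueError("no surviving node to reschedule onto")
    new_placement = dict(placement)
    finish = {}
    free_at = {node: 0.0 for node in alive}  # when each node becomes idle
    for task in topological_order(deps):
        # Unaffected tasks stay where they were; affected ones may go anywhere.
        if placement[task] == failed:
            candidates = alive
        else:
            candidates = [placement[task]]
        best = None
        for node in candidates:
            ready = max(
                (finish[p] + (0.0 if new_placement[p] == node else transfer)
                 for p in deps[task]),
                default=0.0,
            )
            end = max(ready, free_at[node]) + work[task] / speed[node]
            if best is None or end < best[0]:
                best = (end, node)
        end, node = best
        new_placement[task] = node
        finish[task] = end
        free_at[node] = end
    return new_placement, finish


# Illustrative data: a diamond-shaped DAG on three nodes, with node "n2" failing.
deps = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"]}
work = {"a": 2.0, "b": 4.0, "c": 4.0, "d": 2.0}
speed = {"n1": 2.0, "n2": 1.0, "n3": 2.0}
placement = {"a": "n1", "b": "n2", "c": "n3", "d": "n2"}
new_placement, finish = reschedule_after_node_failure(
    deps, work, speed, placement, failed="n2")
print(new_placement, finish["d"])
```

In this toy instance, tasks `b` and `d` are the only ones affected, and the greedy rule moves them to `n1` and `n3` respectively. A link failure would need a different treatment (rerouting or re-placing the tasks whose data crossed the failed link), which this sketch does not attempt.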