Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240800050-8. doi: 10.11896/jsjkx.240800050

• Network & Communication •

Optimization Strategy of Task Offloading Based on Meta Reinforcement Learning

ZHAO Chanchan, YANG Xingchen, SHI Bao, LYU Fei, LIU Libin

  1. School of Information Engineering, Inner Mongolia University of Technology, Hohhot 010080, China
  • Online: 2025-06-16 Published: 2025-06-12
  • Corresponding author: SHI Bao (kshibao@163.com)
  • About author: (cczhao@imut.edu.cn) ZHAO Chanchan, born in 1982, Ph.D, associate professor. Her main research interests include mobile edge computing and blockchain.
    SHI Bao, born in 1982, Ph.D, associate professor. His main research interest is image processing.
  • Supported by:
    Natural Science Foundation of Inner Mongolia Autonomous Region (2023LHMS06016) and Basic Scientific Research Business Fee Project of Universities Directly Under the Inner Mongolia Autonomous Region (JY20240010, JY20230082).

Abstract: With the rapid development of edge computing, task offloading has become a crucial strategy for improving system performance and resource utilization. Existing deep learning-based offloading methods suffer in practice from low sample efficiency and poor adaptability to new environments. To address these issues, a task offloading method based on meta-reinforcement learning (MRL-PPO) is proposed, which aims to offload heterogeneous tasks efficiently in edge computing while minimizing task delay and energy consumption. A sequence-to-sequence (Seq2Seq) network with an attention mechanism is designed, and the application whose tasks are offloaded is modeled as a directed acyclic graph (DAG). The encoder encodes the tasks to be offloaded, and the decoder outputs the offloading decisions based on the context vector, which avoids the training complexity caused by task sequences of varying dimensions. The attention mechanism lets the model focus dynamically on the key features of the offloaded tasks, improving the accuracy and efficiency of its decisions. To improve the performance of the PPO algorithm in complex environments, an intrinsic reward learning algorithm is introduced. Experimental results show that, compared with existing methods, the proposed algorithm performs better across different tasks, adapts quickly to new environments, and effectively reduces delay and energy consumption during task processing.
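
As a rough illustration of two building blocks named in the abstract, the sketch below flattens a task DAG into a sequence that a Seq2Seq encoder could consume, and evaluates the standard PPO clipped surrogate objective. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `topological_order` and `ppo_clipped_objective`, the toy application `app`, and the clipping value 0.2 are all hypothetical, and the paper may order tasks differently (for example by a scheduling priority rather than a plain topological sort).

```python
from collections import deque
from typing import Dict, List


def topological_order(dag: Dict[str, List[str]]) -> List[str]:
    """Flatten a task DAG into a dependency-respecting sequence (Kahn's algorithm).

    `dag` maps each task to the list of tasks that depend on it.
    """
    indegree = {task: 0 for task in dag}
    for successors in dag.values():
        for succ in successors:
            indegree[succ] = indegree.get(succ, 0) + 1

    # Start from tasks with no unfinished predecessors (sorted for determinism).
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order: List[str] = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for succ in dag.get(task, []):
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)

    if len(order) != len(indegree):
        raise ValueError("graph contains a cycle; it is not a DAG")
    return order


def ppo_clipped_objective(ratios: List[float],
                          advantages: List[float],
                          clip_eps: float = 0.2) -> float:
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over the samples."""
    terms = []
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        terms.append(min(r * a, clipped * a))
    return sum(terms) / len(terms)


# Hypothetical application: t1 fans out to t2 and t3, which both feed t4.
app = {"t1": ["t2", "t3"], "t2": ["t4"], "t3": ["t4"], "t4": []}
sequence = topological_order(app)
objective = ppo_clipped_objective([1.5, 0.5], [1.0, -1.0])
```

In this toy case every task appears in `sequence` after all of its predecessors, and clipping caps how much a single update can move the policy: the ratio 1.5 is cut to 1.2 for the positive advantage and the ratio 0.5 is raised to 0.8 for the negative one.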

Key words: Edge computing, Meta reinforcement learning, Task offloading, Seq2Seq network, Attention mechanism

CLC Number: TP391