Computer Science ›› 2021, Vol. 48 ›› Issue (11): 363-371. doi: 10.11896/jsjkx.201000008

• Computer Networks •


Reinforcement Learning Based Dynamic Basestation Orchestration for High Energy Efficiency

ZENG De-ze1, LI Yue-peng1, ZHAO Yu-yang1, GU Lin2   

1 School of Computer Science, China University of Geosciences, Wuhan 430074, China
    2 School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
• Received: 2020-10-03 Revised: 2021-04-10 Online: 2021-11-15 Published: 2021-11-10
  • Corresponding author: GU Lin (anheeno@gmail.com)
  • About author: ZENG De-ze, born in 1984, Ph.D, professor, Ph.D supervisor (deze@cug.edu.cn), is a member of China Computer Federation. His main research interests include edge computing and artificial intelligence.
    GU Lin, born in 1985, Ph.D, associate professor, is a member of China Computer Federation. Her main research interests include edge computing and artificial intelligence.
  • Supported by:
National Natural Science Foundation of China (61772480, 61972171, 62073300) and Open Research Projects of Zhijiang Lab (2021KE0AB02).


Abstract: The mutual promotion of mobile communication technology and the mobile communication industry has brought unprecedented prosperity to the mobile Internet. The explosion of mobile devices, the expansion of network scale and ever-higher service-quality requirements are driving the next technological revolution in wireless networks. 5G achieves a thousand-fold improvement in service performance through dense network deployment, but co-channel interference and highly bursty user requests make this approach extremely energy-hungry. To let 5G networks provide energy-efficient, high-performance services, it is imperative to upgrade the existing network management scheme. In this article, we use a short-cycle management framework with cache queues to manage request bursts in an agile, smooth way, avoiding the drastic fluctuations in service quality that such bursts would otherwise cause. We then apply deep reinforcement learning to learn the user distribution and communication demands, infer the load variation pattern of each base station, and realize pre-scheduling and pre-allocation of energy, improving energy efficiency while guaranteeing quality of service. The double-buffer DQN algorithm proposed in this paper converges nearly 20% faster than the classic DQN algorithm, and saves 4.8% of energy consumption compared with the widely used always-on strategy.
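To make the double-buffer idea concrete, below is a minimal sketch of how such an agent could be structured. It is only an illustration inferred from the abstract, not the paper's implementation: the abstract does not specify the semantics of the two buffers, so the sketch assumes one replay buffer for ordinary-traffic transitions and a second, smaller buffer for transitions collected during request bursts, with every minibatch mixing samples from both so that scarce burst experience keeps influencing training. All identifiers (QNet, DoubleBufferDQN, burst_ratio) are hypothetical.

import random
from collections import deque

import torch
import torch.nn as nn

class QNet(nn.Module):
    # Small MLP mapping a base-station state (e.g., queued requests, recent
    # load, time of day) to Q-values for each action (0 = sleep, 1 = active).
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

class DoubleBufferDQN:
    def __init__(self, state_dim, n_actions, gamma=0.99, lr=1e-3):
        self.q = QNet(state_dim, n_actions)
        self.target_q = QNet(state_dim, n_actions)
        self.target_q.load_state_dict(self.q.state_dict())
        self.opt = torch.optim.Adam(self.q.parameters(), lr=lr)
        self.gamma = gamma
        self.steps = 0
        # Assumed buffer split: ordinary-traffic transitions vs. transitions
        # recorded during request bursts (the paper does not spell this out).
        self.normal_buf = deque(maxlen=50_000)
        self.burst_buf = deque(maxlen=10_000)

    def store(self, transition, is_burst):
        # transition = (state, action, reward, next_state, done)
        (self.burst_buf if is_burst else self.normal_buf).append(transition)

    def update(self, batch_size=64, burst_ratio=0.25):
        # Mix samples from both buffers so burst experience is not drowned
        # out by the far more numerous routine transitions.
        n_burst = min(int(batch_size * burst_ratio), len(self.burst_buf))
        n_normal = min(batch_size - n_burst, len(self.normal_buf))
        if n_normal + n_burst == 0:
            return
        batch = random.sample(list(self.normal_buf), n_normal) + \
                random.sample(list(self.burst_buf), n_burst)
        s = torch.tensor([t[0] for t in batch], dtype=torch.float32)
        a = torch.tensor([t[1] for t in batch], dtype=torch.int64)
        r = torch.tensor([t[2] for t in batch], dtype=torch.float32)
        s2 = torch.tensor([t[3] for t in batch], dtype=torch.float32)
        done = torch.tensor([t[4] for t in batch], dtype=torch.float32)
        # Standard DQN target with a periodically synced target network.
        q_sa = self.q(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = r + self.gamma * (1 - done) * self.target_q(s2).max(1).values
        loss = nn.functional.mse_loss(q_sa, target)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        self.steps += 1
        if self.steps % 100 == 0:
            self.target_q.load_state_dict(self.q.state_dict())

Under this reading, the reported convergence speedup would come from the burst buffer keeping rare but costly burst transitions in the training mix; again, this is one plausible interpretation of "double buffer", not a claim about the authors' exact design.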

Key words: Base station sleep, Deep reinforcement learning, Double buffer, Heterogeneous network, Mobile network management, Request burst

CLC Number: TP391
[1]OH E,KRISHNAMACHARI B,LIU X,et al.Toward dynamic energy efficient operation of cellular network infrastructure[J].IEEE Communications Magazine,2011,49(6):56-61.
[2]LÄHDEKORPI P,HRONEC M,JOLMA P,et al.Energy efficiency of 5G mobile networks with base station sleep modes[C]//2017 IEEE Conference on Standards for Communications and Networking (CSCN).IEEE,2017:163-168.
[3]ONIRETI O,MOHAMED A,PERVAIZ H,et al.Analytical approach to base station sleep mode power consumption and sleep depth[C]//2017 IEEE 28th Annual International Symposium on Personal,Indoor,and Mobile Radio Communications (PIMRC).IEEE,2017:1-7.
[4]ONIRETI O,MOHAMED A,PERVAIZ H,et al.A tractable approach to base station sleep mode power consumption and deactivation latency[C]//2018 IEEE 29th Annual International Symposium on Personal,Indoor and Mobile Radio Communications (PIMRC).IEEE,2018:123-128.
[5]PERVAIZ H,ONIRETI O,MOHAMED A,et al.Energy-efficient and load-proportional eNodeB for 5G user-centric networks:a multilevel sleep strategy mechanism[J].IEEE Vehicular Technology Magazine,2018,13(4):51-59.
[6]LI J,WANG H,WANG X,et al.Optimized sleep strategy based on clustering in dense heterogeneous networks[J].EURASIP Journal on Wireless Communications and Networking,2018,2018(1):1-10.
[7]RATHEESH R,VETRIVELAN P.Energy efficiency based on relay station deployment and sleep mode activation of eNBs for 4G LTE-A network[J].Automatika,2019,60(3):322-331.
[8]KLAPEZ M,GRAZIA C A,CASONI M.Energy Savings of Sleep Modes Enabled by 5G Software-Defined Heterogeneous Networks[C]//2018 IEEE 4th International Forum on Research and Technology for Society and Industry (RTSI).IEEE,2018:1-6.
[9]JAWAD A M,JAWAD H M,NORDIN R,et al.Wireless power transfer with magnetic resonator coupling and sleep/active strategy for a drone charging station in smart agriculture[J].IEEE Access,2019,7:139839-139851.
[10]WU J,BAO Y,MIAO G,et al.Base station sleeping and power control for bursty traffic in cellular networks[C]//2014 IEEE International Conference on Communications Workshops (ICC).IEEE,2014:837-841.
[11]WU J,BAO Y,MIAO G,et al.Base-station sleeping control and power matching for energy-delay tradeoffs with bursty traffic[J].IEEE Transactions on Vehicular Technology,2015,65(5):3657-3675.
[12]LIU J,KRISHNAMACHARI B,ZHOU S,et al.DeepNap:Data-driven base station sleeping operations through deep reinforcement learning[J].IEEE Internet of Things Journal,2018,5(6):4273-4282.
[13]GHADIMI E,CALABRESE F D,PETERS G,et al.A reinforcement learning approach to power control and rate adaptation in cellular networks[C]//2017 IEEE International Conference on Communications (ICC).IEEE,2017:1-7.
[14]WANG L,PETERS G,LIANG Y C,et al.Intelligent User-Centric Networks:Learning-Based Downlink CoMP Region Breathing[J].IEEE Transactions on Vehicular Technology,2020,69(5):5583-5597.
[15]LIU Q,SHI J.Base station sleep and spectrum allocation in heterogeneous ultra-dense networks[J].Wireless Personal Communications,2018,98(4):3611-3627.
[16]NIU Z,GUO X,ZHOU S,et al.Characterizing energy-delay tradeoff in hyper-cellular networks with base station sleeping control[J].IEEE Journal on Selected Areas in Communications,2015,33(4):641-650.
[17]CHEN X,WU J,CAI Y,et al.Energy-efficiency oriented traffic offloading in wireless networks:a brief survey and a learning approach for heterogeneous cellular networks[J].IEEE Journal on Selected Areas in Communications,2015,33(4):627-640.
[18]FENG M,MAO S,JIANG T.BOOST:Base station on-off switching strategy for energy efficient massive MIMO HetNets[C]//IEEE INFOCOM 2016-The 35th Annual IEEE International Conference on Computer Communications.IEEE,2016:1-9.
[19]SAMARAKOON S,BENNIS M,SAAD W,et al.Opportunistic sleep mode strategies in wireless small cell networks[C]//2014 IEEE International Conference on Communications(ICC).IEEE,2014:2707-2712.
[20]SALEM F E,ALTMAN Z,GATI A,et al.Reinforcement learning approach for advanced sleep modes management in 5G networks[C]//2018 IEEE 88th Vehicular Technology Conference (VTC-Fall).IEEE,2018:1-5.
[21]EL-AMINE A,ITURRALDE M,HASSAN H A H,et al.A distributed Q-Learning approach for adaptive sleep modes in 5G networks[C]//2019 IEEE Wireless Communications and Networking Conference (WCNC).IEEE,2019:1-6.
[22]ZHANG Y T.A Deep Reinforcement Learning based Dynamic C-RAN Resource Allocation Method[J].Journal of Chinese Computer Systems,2021,42(1):132-136.