Computer Science ›› 2021, Vol. 48 ›› Issue (7): 333-339. doi: 10.11896/jsjkx.201100154

• Computer Network •

Reinforcement Learning Based Energy Allocation Strategy for Multi-access Wireless Communications with Energy Harvesting

WANG Ying-kai, WANG Qing-shan   

  1. School of Mathematics, Hefei University of Technology, Hefei 230001, China
  • Received: 2020-11-23  Revised: 2021-02-09  Online: 2021-07-15  Published: 2021-07-02
  • About author: WANG Ying-kai, born in 1996, postgraduate. His main research interests include reinforcement learning and wireless communication. (2019111237@mail.hfut.edu.cn)
    WANG Qing-shan, born in 1973, Ph.D. supervisor, is a member of the China Computer Federation. His main research interests include edge computing and gesture recognition.
  • Supported by:
    National Natural Science Foundation of China (61571179).

Abstract: With the growing popularity of the Internet of Things (IoT), the power demands of IoT terminal devices continue to rise. Energy harvesting is a promising way to overcome device energy shortages by generating renewable energy. Because renewable energy arrivals are uncertain in an unknown environment, IoT terminal devices need a sound and effective energy allocation strategy to keep the system running continuously and stably. This paper proposes a DQN-based deep reinforcement learning energy allocation strategy, in which the DQN algorithm interacts directly with the unknown environment to approximate the optimal energy allocation policy without relying on prior knowledge of the environment. In addition, a pre-training algorithm is proposed to optimize the initial state and learning rate of the strategy by exploiting the characteristics of reinforcement learning and the time-invariant system. Simulation results on different channel data sets show that the proposed energy allocation strategy outperforms existing strategies under different channel conditions and exhibits strong learning ability in changing scenarios.
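As a concrete illustration of the DQN-based approach described in the abstract, the sketch below shows a minimal DQN agent that selects a discrete transmit-energy level in each time slot. The state definition (battery level, channel gain, last harvested energy), the log-throughput reward, the toy harvesting and fading dynamics, and all hyperparameters are illustrative assumptions; they do not reproduce the paper's system model or its pre-training algorithm.

```python
# Minimal illustrative DQN sketch for energy-harvesting transmit-energy allocation.
# State, reward, and environment dynamics below are assumptions, not the paper's model.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

N_ACTIONS = 5   # discrete fractions of the battery to spend per slot
STATE_DIM = 3   # (battery level, channel gain, last harvested energy)

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS))
    def forward(self, x):
        return self.net(x)

def step(state, action_idx):
    """Toy dynamics: spend a fraction of the battery, earn log(1 + e*h)
    throughput, then harvest a random amount of renewable energy."""
    battery, gain, _ = state
    spend = battery * action_idx / (N_ACTIONS - 1)
    reward = np.log1p(spend * gain)                  # throughput proxy
    harvested = np.random.exponential(0.5)           # renewable energy arrival
    battery = min(battery - spend + harvested, 2.0)  # finite battery capacity
    gain = np.random.rayleigh(0.8)                   # block-fading channel
    return np.array([battery, gain, harvested], dtype=np.float32), reward

q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
buffer = deque(maxlen=10000)
state = np.array([1.0, 1.0, 0.0], dtype=np.float32)
eps, gamma = 1.0, 0.95

for t in range(5000):
    # epsilon-greedy selection over transmit-energy levels
    if random.random() < eps:
        a = random.randrange(N_ACTIONS)
    else:
        with torch.no_grad():
            a = int(q_net(torch.tensor(state)).argmax())
    next_state, r = step(state, a)
    buffer.append((state, a, r, next_state))
    state, eps = next_state, max(0.05, eps * 0.999)

    if len(buffer) >= 64:
        batch = random.sample(buffer, 64)
        s, a_b, r_b, s2 = map(np.array, zip(*batch))
        s, s2 = torch.tensor(s), torch.tensor(s2)
        q = q_net(s).gather(1, torch.tensor(a_b, dtype=torch.int64).view(-1, 1)).squeeze(1)
        with torch.no_grad():
            target = torch.tensor(r_b, dtype=torch.float32) + gamma * target_net(s2).max(1).values
        loss = nn.functional.mse_loss(q, target)
        opt.zero_grad(); loss.backward(); opt.step()
    if t % 200 == 0:
        target_net.load_state_dict(q_net.state_dict())  # periodic target sync
```

The experience replay and periodic target-network synchronization follow standard DQN practice; the paper's pre-training of the initial state and learning rate is not reproduced here.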

Key words: Deep reinforcement learning, Energy harvesting, Markov decision process, Resource allocation, Wireless communication

CLC Number: TP391