Computer Science ›› 2021, Vol. 48 ›› Issue (9): 271-277. doi: 10.11896/jsjkx.201000078

• Computer Network •

Deep Reinforcement Learning Based UAV Assisted SVC Video Multicast

CHENG Zhao-wei1,2, SHEN Hang1,2, WANG Yue1, WANG Min1, BAI Guang-wei1   

  1 College of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
  2 State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
  • Received: 2020-10-14  Revised: 2021-03-15  Online: 2021-09-15  Published: 2021-09-10
  • About author: CHENG Zhao-wei, born in 1995, postgraduate. His main research interests include space-air-ground integrated networks.
    SHEN Hang, born in 1984, Ph.D, associate professor. His main research interests include network slicing and space-air-ground integrated networks.
  • Supported by:
    National Natural Science Foundation of China (61502230), Natural Science Foundation of Jiangsu Province (BK20201357), Six Talent Peaks Project in Jiangsu Province (RJFW-020), State Key Laboratory Program for Novel Software Technology (KFKT2017B21), Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX20_1079, SJCX20_0351) and University-Industry Collaborative Education Program of the Ministry of Education (201902182003)

Abstract: This paper proposes a flexible video multicast mechanism assisted by a UAV base station. Combined with scalable video coding (SVC), the dynamic deployment of the UAV and the allocation of radio resources are optimized jointly to maximize the total number of enhancement layers received by users. Because user movement within the coverage of the macro base station changes the network topology, traditional heuristic algorithms struggle to cope with this complexity. To this end, the deep-reinforcement-learning-based DDPG algorithm is used to train a neural network that decides the optimal UAV location and bandwidth allocation proportion. After the model converges, the learning agent can find the optimal UAV deployment and bandwidth allocation strategy in a short time. Simulation results show that the proposed scheme achieves the expected goals and outperforms an existing Q-learning-based scheme.
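To make the DDPG decision loop described above concrete, the sketch below is a minimal PyTorch actor-critic pair whose continuous action is interpreted as the UAV's horizontal position plus a bandwidth-allocation fraction. It is not the paper's code: the state dimension, network sizes, Ornstein-Uhlenbeck noise parameters, and the action layout are all illustrative assumptions.

```python
# Minimal DDPG sketch for the decision described in the abstract.
# All dimensions, names, and hyperparameters are illustrative assumptions.
import torch
import torch.nn as nn

STATE_DIM = 20    # assumed observation: e.g., user positions / channel indicators
ACTION_DIM = 3    # assumed action: (uav_x, uav_y, bandwidth_fraction), each in [0, 1]


class Actor(nn.Module):
    """Deterministic policy: state -> normalized action in [0, 1]^ACTION_DIM."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.LayerNorm(128),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, ACTION_DIM), nn.Sigmoid(),
        )

    def forward(self, state):
        return self.net(state)


class Critic(nn.Module):
    """Q-network: (state, action) -> scalar value estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))


def soft_update(target, source, tau=0.005):
    """Polyak averaging of target-network parameters, as in standard DDPG."""
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.mul_(1 - tau).add_(tau * s.data)


class OUNoise:
    """Ornstein-Uhlenbeck exploration noise; theta/sigma are conventional defaults."""
    def __init__(self, dim, theta=0.15, sigma=0.2):
        self.dim, self.theta, self.sigma = dim, theta, sigma
        self.x = torch.zeros(dim)

    def sample(self):
        self.x += self.theta * (-self.x) + self.sigma * torch.randn(self.dim)
        return self.x


if __name__ == "__main__":
    actor, critic = Actor(), Critic()
    noise = OUNoise(ACTION_DIM)
    state = torch.rand(1, STATE_DIM)                       # placeholder observation
    action = (actor(state) + noise.sample()).clamp(0.0, 1.0)
    q_value = critic(state, action)                        # value estimate for this choice
    print("UAV position + bandwidth fraction:", action.detach().numpy())
    print("Q estimate:", q_value.item())
```

In a full training loop, the critic would be regressed toward Bellman targets built from a reward such as the total number of enhancement layers delivered per step, the actor would be updated to maximize the critic's estimate, and soft_update would track target copies of both networks, following the standard DDPG procedure.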

Key words: Deep reinforcement learning, Mobile Internet, Multicast, Scalable video coding (SVC), Unmanned aerial vehicles

CLC Number: TP393