Computer Science, 2023, Vol. 50, Issue (11A): 221000088-5. doi: 10.11896/jsjkx.221000088

• Network & Communication •

Study on Relay Decision in Wireless Heterogeneous Networks Based on Deep Reinforcement Learning

ZHOU Tianyu, GUAN Zheng

  1. School of Information Science & Engineering, Yunnan University, Kunming 650500, China
  • Published: 2023-11-09
  • Corresponding author: GUAN Zheng (gz_627@sina.com)
  • About author: (zty@mail.ynu.edu.cn)
    ZHOU Tianyu, born in 1999, master. Her main research interests include deep reinforcement learning and intelligent mobile communication.
    GUAN Zheng, born in 1982, Ph.D., associate professor, master supervisor, is a member of the China Computer Federation. Her main research interests include wireless sensor networks, network access technology, and performance analysis and optimization of polling systems.
  • Supported by:
    National Natural Science Foundation of China (61761045), Research Foundation of Yunnan Province (202201AT070167), and Research Project of Yunnan University (2021Y189).

Abstract: In large-scale multi-user Internet of Things scenarios, remote nodes must access the network through a relay. To address adaptive access control for a relay operating among heterogeneous access technologies, an intelligent relay access control strategy based on deep reinforcement learning is proposed. The relay's reception and forwarding of remote-user data is modeled as a partially observable Markov decision process, and the relay's working state is decided dynamically to maximize total system throughput and node fairness. First, an uplink model of a wireless heterogeneous network with a relay is established, and a dynamic relay decision optimization model is formulated with the goal of improving total system throughput. Second, a deep Q-network (DQN) with an LSTM hidden layer is constructed as the action-state value function to optimize total system throughput. Test results show that the deep reinforcement learning relay decision scheme for wireless heterogeneous networks (DRL-RAP) can provide network access for remote users while preserving the quality of service of the existing users. Total system throughput improves significantly over the original network, by up to 30%.

Key words: Internet of Things, Wireless heterogeneous network, Deep reinforcement learning, Intelligent relay decision, Neural network
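
The decision loop described in the abstract can be illustrated with a toy sketch. This is not the authors' DRL-RAP: it swaps the LSTM-based DQN for tabular Q-learning over the last two channel observations, and the frame pattern, observation symbols, reward, and all names (`FRAME`, `step`, `greedy`, `train`, `evaluate`) are assumptions made purely for illustration. The relay never sees the slot index, only channel feedback, which stands in for the partial observability mentioned above.

```python
import random

# Hypothetical incumbent schedule: True = the existing node transmits in that slot.
FRAME = (True, True, False, False)
WAIT, TRANSMIT = 0, 1


def step(slot, action):
    """Return (observation, reward) for one slot of the toy shared channel.

    Reward counts any successful transmission, so it tracks total throughput.
    """
    incumbent = FRAME[slot % len(FRAME)]
    if action == TRANSMIT:
        # "C" = collision (nobody succeeds), "S" = relay succeeds.
        return ("C", 0.0) if incumbent else ("S", 1.0)
    # "B" = channel busy with a successful incumbent packet, "I" = idle slot.
    return ("B", 1.0) if incumbent else ("I", 0.0)


def greedy(q, state):
    """Pick the action with the larger Q-value; ties go to WAIT."""
    values = q.get(state, [0.0, 0.0])
    return WAIT if values[WAIT] >= values[TRANSMIT] else TRANSMIT


def train(slots=50_000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; the state is the last two observations."""
    rng = random.Random(seed)
    q = {}
    state = ("I", "I")  # assumed start: the two previous slots were idle
    for slot in range(slots):
        if rng.random() < epsilon:
            action = rng.choice((WAIT, TRANSMIT))  # explore
        else:
            action = greedy(q, state)              # exploit
        obs, reward = step(slot, action)
        nxt = (state[1], obs)
        target = reward + gamma * max(q.get(nxt, [0.0, 0.0]))
        row = q.setdefault(state, [0.0, 0.0])
        row[action] += alpha * (target - row[action])
        state = nxt
    return q


def evaluate(q, slots=1_000):
    """Run the greedy policy and return successful slots per slot."""
    state, total = ("I", "I"), 0.0
    for slot in range(slots):
        obs, reward = step(slot, greedy(q, state))
        total += reward
        state = (state[1], obs)
    return total / slots


if __name__ == "__main__":
    print("untrained throughput:", evaluate({}))
    print("trained throughput:  ", evaluate(train()))
```

In this toy setting an untrained relay always waits, so only the incumbent's slots carry traffic; a trained relay should learn to fill the idle slots without colliding. The figures it prints say nothing about the 30% improvement reported in the abstract, which comes from the authors' own model and experiments.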

CLC Number: 

  • TN925