Computer Science ›› 2020, Vol. 47 ›› Issue (2): 233-238.doi: 10.11896/jsjkx.190100070

• Computer Network •

RFID Indoor Positioning Algorithm Based on Asynchronous Advantage Actor-Critic

LI Li, ZHENG Jia-li, WANG Zhe, YUAN Yuan, SHI Jing

  1. School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China;
  2. Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China
  • Received: 2019-01-10  Online: 2020-02-15  Published: 2020-03-18
  • About author: LI Li, born in 1994, postgraduate. Her main research interests include information processing and communication networks, reinforcement learning and Internet of Things. ZHENG Jia-li, born in 1979, professor. His main research interests include Internet of Things, RFID and artificial intelligence.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61761004).

Abstract: Since the accuracy of existing RFID indoor positioning algorithms is easily affected by environmental factors and their robustness is weak, this paper proposes an RFID indoor positioning algorithm based on asynchronous advantage actor-critic (A3C). The main steps of the algorithm are as follows. First, the received signal strength indication (RSSI) of the RFID signal is used as the input. Multi-threaded sub-actor networks sample and learn from the environment in parallel, while sub-critic networks evaluate the quality of the chosen actions, so that the model is continuously optimized toward the best RSSI values and the positioning model is trained. Each sub-thread network periodically pushes its parameter updates to the global network, and the global network finally outputs the specific location of the reference tag; in this way the asynchronous advantage actor-critic positioning model is trained. Second, in the online positioning stage, when a target enters the area to be tested, its RSSI values are recorded and fed into the trained A3C positioning model. The sub-thread networks obtain the latest parameters from the global network, locate the target, and finally output its specific position. The proposed algorithm is compared with traditional RFID indoor positioning algorithms based on Support Vector Machines (SVM), Extreme Learning Machine (ELM), and Multi-Layer Perceptron (MLP). Experimental results show that the mean positioning error of the proposed algorithm is reduced by 66.114%, 50.316% and 44.494% respectively, and its average positioning stability is improved by 59.733%, 53.083% and 43.748% respectively. These results show that the proposed A3C-based RFID indoor positioning algorithm achieves better positioning performance when dealing with a large number of indoor positioning targets.

Key words: Asynchronous advantage actor-critic, Indoor positioning, Reinforcement learning, RFID, RSSI

CLC Number: TP301.6