Computer Science (计算机科学), 2020, Vol. 47, Issue (2): 233-238. doi: 10.11896/jsjkx.190100070
LI Li (李丽), ZHENG Jia-li (郑嘉利), WANG Zhe (王哲), YUAN Yuan (袁源), SHI Jing (石静)
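Abstract: To address the problem that the accuracy of existing RFID indoor localization algorithms is easily affected by environmental factors, an RFID indoor localization algorithm based on Asynchronous Advantage Actor-Critic (A3C) is proposed. The algorithm has two main steps. 1) The received signal strength indication (RSSI) values of the RFID signals are taken as input. The actor sub-networks of multiple threads interact, sample, and learn in parallel, while the critic sub-networks evaluate how good the chosen actions are, so that the model is continually optimized, the optimal RSSI values are found, and the localization model is trained. The worker-thread networks periodically and asynchronously update their parameters to the global network, which finally outputs the specific positions of the reference tags; this training yields the A3C localization model. 2) In the online localization phase, when a target enters the area under test, its RSSI values are recorded and fed into the A3C localization model. The worker-thread networks fetch the latest localization information from the global network, localize the target, and finally output its specific position. Experimental data show that, compared with traditional RFID indoor localization algorithms based on Support Vector Machines (SVM), Extreme Learning Machine (ELM), and Multi-Layer Perceptron (MLP), the proposed algorithm reduces the mean localization error by 66.114%, 50.316%, and 44.494%, respectively, and improves localization stability by an average of 59.733%, 53.083%, and 43.748%, respectively. The results indicate that the A3C-based RFID indoor localization algorithm achieves good localization performance when handling a large number of indoor localization targets.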
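The two phases in the abstract (worker threads that learn in parallel and push updates to a global model, followed by online lookup of a target's position from its RSSI values) might be sketched roughly as follows. This is not the authors' implementation. The reader layout, the log-distance path-loss constants, the learning rate, and all function names are hypothetical, and a plain squared-error linear regressor stands in for the actor and critic networks, so only the asynchronous worker-to-global update pattern is illustrated.

```python
import math
import random
import threading

# Assumed layout: four reader antennas at the corners of a 10 m x 10 m room.
READERS = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]

def rssi_features(x, y):
    """Scaled synthetic RSSI vector from an assumed log-distance path-loss model."""
    feats = []
    for rx, ry in READERS:
        d = max(math.hypot(x - rx, y - ry), 0.1)
        rssi = -40.0 - 20.0 * math.log10(d)  # dBm, hypothetical constants
        feats.append((rssi + 55.0) / 10.0)   # rough normalisation
    return feats

def predict(params, feats):
    """Linear map; params holds one row [w1..w4, bias] per output coordinate."""
    return [sum(w * f for w, f in zip(row[:-1], feats)) + row[-1] for row in params]

def mean_error(params, points):
    """Mean Euclidean distance between predicted and true positions."""
    total = 0.0
    for x, y in points:
        px, py = predict(params, rssi_features(x, y))
        total += math.hypot(px - x, py - y)
    return total / len(points)

def worker(global_params, lock, tags, steps, lr, seed):
    """One worker thread: pull global parameters, learn locally, push the update."""
    rng = random.Random(seed)
    for _ in range(steps):
        with lock:  # fetch the latest global parameters
            local = [row[:] for row in global_params]
        x, y = rng.choice(tags)  # sample one reference tag
        feats = rssi_features(x, y)
        pred = predict(local, feats)
        inputs = feats + [1.0]  # the extra 1.0 is the bias input
        # Squared-error gradient (a stand-in for the actor-critic loss).
        grads = [[2.0 * (p - t) * f for f in inputs] for p, t in zip(pred, (x, y))]
        with lock:  # apply this worker's update to the global model
            for row, grad_row in zip(global_params, grads):
                for i, g in enumerate(grad_row):
                    row[i] -= lr * g

def train(n_workers=4, steps=2000, lr=0.05):
    """Training phase: reference tags at known positions on a 1 m grid."""
    tags = [(float(i), float(j)) for i in range(1, 10) for j in range(1, 10)]
    global_params = [[0.0] * 5 for _ in range(2)]
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(global_params, lock, tags, steps, lr, k))
        for k in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return global_params

# Targets placed between the reference tags, used only for evaluation.
TEST_POINTS = [(i + 0.5, j + 0.5) for i in range(1, 9) for j in range(1, 9)]

params = train()
# Online phase: estimate the position of a target from its RSSI vector.
estimate = predict(params, rssi_features(4.5, 6.5))
print("estimated position:", estimate)
print("mean error on test points (m):", mean_error(params, TEST_POINTS))
```

Each worker computes its gradient against a snapshot that may already be stale by the time it is applied, which is the essence of the asynchronous scheme. A fuller A3C version would replace the regression gradient with policy and value gradients weighted by the advantage estimate.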