Computer Science, 2021, Vol. 48, Issue (7): 47-54. doi: 10.11896/jsjkx.210400021
Special topic: Artificial Intelligence Security
李贝贝, 宋佳芮, 杜卿芸, 何俊江
LI Bei-bei, SONG Jia-rui, DU Qing-yun, HE Jun-jiang
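Abstract: In recent years, the Industrial Internet of Things (IIoT) has developed rapidly. While it enables the digitalization, automation, and intelligentization of industry, it has also introduced a large number of cyber threats, and the complex, diverse IIoT environment gives network intruders entirely new attack surfaces. Traditional intrusion detection techniques can no longer meet the need for threat discovery in today's IIoT environments. To address this, this paper proposes an IIoT intrusion detection system based on the deep reinforcement learning algorithm Proximal Policy Optimization 2.0 (PPO2). The system combines the perception capability of deep learning with the decision-making capability of reinforcement learning to effectively detect multiple types of cyber attacks on the IIoT. First, a LightGBM-based feature selection algorithm is used to select the most effective feature set from the IIoT data. Then, drawing on deep learning, the hidden layers of a multilayer perceptron network are used as the network structure shared by the value network and the policy network in the PPO2 algorithm. Finally, the intrusion detection model is built on the PPO2 algorithm, and ReLU (Rectified Linear Unit) is used for the classification output. Extensive experiments on a real IIoT dataset publicly released by the Oak Ridge National Laboratory of the U.S. Department of Energy show that the proposed intrusion detection system achieves an accuracy of 99.09% when detecting multiple types of cyber attacks on the IIoT, and that it outperforms existing intrusion detection systems based on deep learning models such as LSTM, CNN, and RNN and on deep reinforcement learning models such as DDQN and DQN in terms of accuracy, precision, recall, and F1 score.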
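The shared-network idea described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. One hidden layer feeds both a policy head (class probabilities) and a value head, and `clipped_surrogate` evaluates the standard PPO clipped objective for a single sample. The class name `SharedActorCritic`, the layer sizes, the random initialization, the placement of ReLU in the hidden layer, the softmax policy head, and the clip range of 0.2 are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(inputs, weights, bias):
    """Dense layer: one weight row and one bias term per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Hypothetical initializer: small uniform weights, zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class SharedActorCritic:
    """Policy and value heads on top of one shared hidden layer (assumed layout)."""

    def __init__(self, n_features, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.trunk = init_layer(n_features, n_hidden, rng)       # shared by both heads
        self.policy_head = init_layer(n_hidden, n_classes, rng)  # action = predicted class
        self.value_head = init_layer(n_hidden, 1, rng)           # scalar state value

    def forward(self, features):
        hidden = relu(linear(features, *self.trunk))
        probs = softmax(linear(hidden, *self.policy_head))
        value = linear(hidden, *self.value_head)[0]
        return probs, value


def clipped_surrogate(new_prob, old_prob, advantage, clip_range=0.2):
    """PPO clipped objective for one sample: min(r * A, clip(r, 1 - eps, 1 + eps) * A)."""
    ratio = new_prob / old_prob
    clipped = min(max(ratio, 1.0 - clip_range), 1.0 + clip_range)
    return min(ratio * advantage, clipped * advantage)
```

For example, `SharedActorCritic(n_features=4, n_hidden=8, n_classes=3).forward([0.1, 0.5, -0.2, 0.9])` returns three class probabilities that sum to 1, plus one value estimate. Sharing the hidden layer means the features learned for classification are reused for value estimation, and the clipping keeps each policy update close to the previous policy.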