Research on Intrusion Detection Classification Based on Random Forest

Computer Science, 2021, Vol. 48, Issue 6A: 459-463. doi: 10.11896/jsjkx.200600161

CAO Yang-chen¹, ZHU Guo-sheng¹, QI Xiao-yun², ZOU Jie¹

1 School of Computer and Information Engineering, Hubei University, Wuhan 430062, China
2 School of Chemistry and Chemical Engineering, Hubei University, Wuhan 430062, China

Online: 2021-06-10 Published: 2021-06-17
Corresponding author: ZHU Guo-sheng (zhuguosheng@hubu.edu.cn)
About author: 943407866@qq.com
CAO Yang-chen, born in 1996, postgraduate. Her main research interests include machine learning and network traffic analysis.
ZHU Guo-sheng, born in 1972, Ph.D., professor. His main research interests include next-generation internet and software-defined networks.
Supported by:
CERNET Next Generation Internet Technology Innovation Project; Special Operation Education and Training System Based on Cloud VR and IPv6 (NGII20180507).

Abstract

To detect network attacks effectively, machine learning methods are widely used to classify different types of network intrusions. Traditional decision tree methods usually train a single model on the data, which tends to produce a large generalization error and to overfit. To address this problem, this paper introduces the idea of parallel ensemble learning and proposes an intrusion detection model based on random forest. Because every decision tree in the random forest votes on the final decision, classification accuracy improves over a single tree. The model is trained and tested on the NSL-KDD dataset, and the experimental results show an accuracy of up to 99.91%, indicating that the model classifies intrusions well on this dataset.

Key words: Decision tree, Intrusion detection, Machine learning, Random forest
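
The voting idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one-split stumps in place of full decision trees, a mean-value threshold instead of an impurity-based split, and a hypothetical toy dataset (`TOY_ROWS`) rather than NSL-KDD. All function names are illustrative.

```python
import random
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the largest number of trees."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, rng):
    """Fit a one-split tree (a stump) on one randomly chosen feature.

    rows is a list of (features, label) pairs with numeric features.
    The threshold is the feature mean, a deliberate simplification.
    """
    feature = rng.randrange(len(rows[0][0]))
    threshold = sum(x[feature] for x, _ in rows) / len(rows)
    fallback = majority_vote([y for _, y in rows])
    left = [y for x, y in rows if x[feature] <= threshold]
    right = [y for x, y in rows if x[feature] > threshold]
    return (
        feature,
        threshold,
        majority_vote(left) if left else fallback,
        majority_vote(right) if right else fallback,
    )


def predict_stump(stump, x):
    """Route x to the left or right leaf of a single stump."""
    feature, threshold, left_label, right_label = stump
    return left_label if x[feature] <= threshold else right_label


def train_forest(rows, n_trees=25, seed=0):
    """Bagging: each stump is trained on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # draw with replacement
        forest.append(train_stump(sample, rng))
    return forest


def predict_forest(forest, x):
    """Every tree votes; the forest returns the majority label."""
    return majority_vote([predict_stump(s, x) for s in forest])


def accuracy(forest, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict_forest(forest, x) == y for x, y in rows)
    return hits / len(rows)


# Hypothetical toy data: two numeric features per connection record.
TOY_ROWS = (
    [((0.10 + 0.02 * i, 0.20 + 0.01 * i), "normal") for i in range(10)]
    + [((0.70 + 0.02 * i, 0.80 + 0.01 * i), "attack") for i in range(10)]
)

if __name__ == "__main__":
    forest = train_forest(TOY_ROWS)
    print(predict_forest(forest, (0.15, 0.22)))
    print(predict_forest(forest, (0.85, 0.86)))
    print(accuracy(forest, TOY_ROWS))
```

Because each stump sees a different bootstrap sample and a randomly chosen feature, individual stumps can disagree; the majority vote smooths out those individual errors, which is the effect the abstract attributes to the forest. A realistic experiment would use full decision trees and the actual NSL-KDD features.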

CLC Number: TP181

CAO Yang-chen, ZHU Guo-sheng, QI Xiao-yun, ZOU Jie. Research on Intrusion Detection Classification Based on Random Forest[J]. Computer Science, 2021, 48(6A): 459-463. https://doi.org/10.11896/jsjkx.200600161
