Computer Science ›› 2022, Vol. 49 ›› Issue (11A): 210800237-6. doi: 10.11896/jsjkx.210800237
俞赛赛1, 王小娟2, 章倩倩3
YU Sai-sai1, WANG Xiao-juan2, ZHANG Qian-qian3
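Abstract: As encrypted traffic accounts for a growing share of network traffic, more and more malicious behavior is hidden inside it, and the network security threat landscape is becoming increasingly severe. Encrypted traffic that carries malicious behavior exhibits many traffic features, and these features are partly redundant with one another. Redundant features increase detection time and reduce the efficiency of the detection model. Following the principle of heuristic search strategies, this paper screens the many different features of encrypted traffic to find a representative feature combination. First, features are ranked by importance using the random forest algorithm, and those with a large influence on the classification result are retained. Then the Pearson correlation coefficient is used to compute the similarity between all features, and a combination of mutually fairly independent features is selected. Experimental results on the CTU-13 dataset show that selecting a representative feature combination reduces detection time without lowering detection accuracy, improving the efficiency of detecting malicious behavior in encrypted traffic.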
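The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give their exact procedure, so the greedy filtering order, the `top_k` cutoff, the `max_abs_corr` threshold, the feature names, and the toy values below are all assumptions made for illustration. The importance scores are taken as precomputed input (for example, they could come from a random forest's feature importances); only the Pearson-based redundancy filter is implemented here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Constant column: correlation is undefined; treat as 0 (assumption).
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(importances, columns, top_k, max_abs_corr):
    """Greedy two-stage selection (hypothetical helper).

    Stage 1: keep the top_k features by importance score.
    Stage 2: walk them in importance order and keep a feature only if its
    absolute Pearson correlation with every already-kept feature does not
    exceed max_abs_corr.
    """
    ranked = sorted(importances, key=importances.get, reverse=True)[:top_k]
    selected = []
    for name in ranked:
        if all(
            abs(pearson(columns[name], columns[kept])) <= max_abs_corr
            for kept in selected
        ):
            selected.append(name)
    return selected


# Toy per-flow feature columns (made-up values, hypothetical feature names).
columns = {
    "duration": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bytes_out": [2.0, 4.1, 6.0, 8.2, 10.0],  # nearly linear in duration
    "pkt_count": [5.0, 1.0, 4.0, 2.0, 3.0],
    "tls_version": [1.0, 1.0, 1.0, 1.0, 1.0],
}
# Importance scores assumed to be precomputed by a random forest.
importances = {
    "duration": 0.40,
    "bytes_out": 0.35,
    "pkt_count": 0.20,
    "tls_version": 0.05,
}

chosen = select_features(importances, columns, top_k=3, max_abs_corr=0.9)
print(chosen)
```

In this toy run, `bytes_out` is dropped because it is almost perfectly correlated with the higher-ranked `duration`, while `pkt_count` is kept because its correlation with `duration` is weak. Processing features in importance order means that, among a group of redundant features, the one with the most influence on classification is the one that survives.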