Computer Science ›› 2022, Vol. 49 ›› Issue (11A): 210800237-6. doi: 10.11896/jsjkx.210800237

• Information Security •

Detection of Malicious Behavior in Encrypted Traffic Based on Heuristic Search Feature Selection

YU Sai-sai1, WANG Xiao-juan2, ZHANG Qian-qian3   

  1 Consensus 30 Research Institute of China Electronics Technology Group, Chengdu 610096, China
  2 School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100089, China
  3 Naval Academy Library, Bengbu, Anhui 233040, China
  • Online: 2022-11-10 Published: 2022-11-21
  • About author: YU Sai-sai, born in 1982, Ph.D., senior engineer. His main research interests include cyber security.
    WANG Xiao-juan, Ph.D., associate professor. Her main research interests include cyber security, complex networks and deep learning.

Abstract: As the proportion of encrypted traffic in networks grows, more and more malicious behavior is hidden within it, making the network security situation increasingly serious. Encrypted traffic that carries malicious behavior exhibits many traffic features, some of which are redundant. Redundant features increase detection time and reduce the efficiency of model detection. Based on a heuristic search strategy, this paper screens the many different features of encrypted traffic to find a representative feature combination. First, features are ranked by importance using the random forest algorithm, and those with a large impact on the classification results are selected. Then, the similarity between all features is calculated with the Pearson correlation coefficient, and relatively independent feature combinations are selected. Experimental results on the CTU-13 data set show that screening representative feature combinations reduces detection time and improves the efficiency of detecting malicious behavior in encrypted traffic without decreasing detection accuracy.
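
The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the importance scores are assumed to have been computed already (for example by a random forest), and the feature names, the `top_k` cut-off and the `max_corr` threshold are hypothetical values chosen only for illustration. The sketch keeps the most important features and then greedily drops any feature whose Pearson correlation with an already kept feature is too high.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treated as uncorrelated (an assumption)
    return cov / sqrt(var_x * var_y)


def select_features(columns, importance, top_k=3, max_corr=0.9):
    """Keep the top_k most important features, then greedily drop redundant ones.

    columns    -- dict: feature name -> list of values (one per flow)
    importance -- dict: feature name -> importance score (assumed precomputed)
    """
    # Stage 1: rank by importance and keep the top_k candidates.
    ranked = sorted(importance, key=importance.get, reverse=True)[:top_k]
    # Stage 2: walk the ranking and keep a feature only if it is not
    # strongly correlated with any feature that was already kept.
    selected = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[kept])) < max_corr
               for kept in selected):
            selected.append(name)
    return selected


# Toy data with hypothetical feature names (not taken from CTU-13).
columns = {
    "bytes_out": [100, 200, 300, 400, 500],
    "packets_out": [10, 20, 30, 40, 50],  # perfectly correlated with bytes_out
    "duration": [5, 1, 4, 2, 3],
    "dst_port": [443, 443, 8443, 443, 80],
}
importance = {"bytes_out": 0.40, "packets_out": 0.30,
              "duration": 0.20, "dst_port": 0.10}

selected = select_features(columns, importance, top_k=3, max_corr=0.9)
print(selected)  # ['bytes_out', 'duration']
```

In this toy run `packets_out` is discarded because it duplicates the information in `bytes_out`, while `dst_port` never enters the candidate set because its importance falls below the top-3 cut-off.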

Key words: Encrypted traffic, Malicious behavior, Heuristic search strategy, Feature selection

CLC Number: 

  • TP309