Computer Science ›› 2024, Vol. 51 ›› Issue (11): 340-346. doi: 10.11896/jsjkx.231000121

• Information Security •

Malicious Encrypted Traffic Detection Method Based on Conversation Statistical Encoder Model

GONG Siyue, LIU Hui, WANG Baohui   

  1. College of Software, Beihang University, Beijing 100000, China
  • Received: 2023-10-18 Revised: 2024-03-07 Online: 2024-11-15 Published: 2024-11-06
  • About author: GONG Siyue, born in 1996, postgraduate. His main research interests include natural language processing and malicious traffic detection.
    WANG Baohui, born in 1973, Ph.D, professor. His research interests include big data, artificial intelligence and network information security.

Abstract: With the development and widespread adoption of network technology, traffic encryption has become a key technology for protecting user privacy. However, malware and attackers also use encrypted traffic to hide their behavior and evade traditional network intrusion detection systems. Existing methods for detecting malicious encrypted traffic have shortcomings. Statistics-based methods rely on expert experience for feature extraction, and features designed for one protocol do not generalize to others. Deep learning methods that take raw input suffer from incomplete information and padded fields, which leads to an insufficient semantic representation of encrypted traffic interactions. To solve these problems, this paper proposes the conversation statistical encoder model (CSEM). Drawing on the transformer encoder model, the method introduces a new way of parsing traffic packet features that differs from the traditional approach of feeding byte streams into deep neural networks. It constructs a fixed-length vector representation for each traffic packet without zero padding, and its feature extraction does not depend on any specific encryption protocol. A hybrid deep neural network is constructed, providing a new approach to malicious encrypted traffic detection. The method is evaluated on the DataCon dataset and a self-built dataset. On the DataCon dataset it achieves a recall of 0.9911, a precision of 0.9407, and an F1 score of 0.9652, reaching the current best level, with an F1 score 9% higher than that of the random forest model.
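
The idea of a fixed-length, protocol-independent packet representation can be illustrated with a minimal sketch. The abstract does not specify the features CSEM actually uses, so everything below is an assumption made for illustration: the function name `packet_vector`, the choice of four summary statistics (length, mean, standard deviation, Shannon entropy), and the 16-bin byte histogram are all hypothetical and are not the authors' feature set.

```python
import math
from collections import Counter

# Hypothetical layout: 4 summary statistics + 16 histogram bins.
VECTOR_SIZE = 4 + 16


def packet_vector(payload: bytes) -> list[float]:
    """Map a payload of any length to a fixed-length statistical vector.

    Illustrative only: these features are assumptions, not the CSEM features.
    """
    n = len(payload)
    if n == 0:
        # Degenerate case: an empty payload carries no statistics.
        return [0.0] * VECTOR_SIZE

    counts = Counter(payload)  # byte value -> number of occurrences
    mean = sum(payload) / n
    variance = sum((b - mean) ** 2 for b in payload) / n
    # Shannon entropy in bits per byte, between 0 and 8.
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())

    # Coarse histogram: the high 4 bits of each byte select one of 16 bins.
    hist = [0.0] * 16
    for value, c in counts.items():
        hist[value >> 4] += c / n

    return [float(n), mean / 255.0, math.sqrt(variance) / 255.0, entropy / 8.0] + hist


if __name__ == "__main__":
    short = packet_vector(b"\x16\x03\x01")         # 3-byte payload
    long_ = packet_vector(bytes(range(256)) * 4)   # 1024-byte payload
    print(len(short), len(long_))                  # both have VECTOR_SIZE entries
```

Because every statistic is computed over the bytes that are actually present, short and long packets yield vectors of the same size without any zero padding, and no protocol field is parsed. In a design like the one the abstract describes, a sequence of such per-packet vectors from one conversation could then serve as the input sequence of an encoder; that wiring is likewise an assumption here.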

Key words: Conversation, Encrypted traffic detection, Encoder

CLC Number: 

  • TP312