Computer Science, 2020, Vol. 47, Issue (6A): 381-385. doi: 10.11896/JsJkx.191200155

• Information Security •

HTTPS Encrypted Traffic Classification Method Based on C4.5 Decision Tree

ZOU Jie1, ZHU Guo-sheng1, QI Xiao-yun2 and CAO Yang-chen1   

  • 1 School of Computer and Information Engineering, Hubei University, Wuhan 430062, China
    2 School of Chemistry and Chemical Engineering, Hubei University, Wuhan 430062, China
  • Published:2020-07-07
  • About author: ZOU Jie, born in 1996, postgraduate. Her main research interests include machine learning and network traffic analysis.
    ZHU Guo-sheng, born in 1972, Ph.D., professor. His main research interests include next-generation Internet and software-defined networks.
  • Supported by:
    This work was supported by the CERNET Innovation Project (NGII20180411).

Abstract: HTTPS is built on HTTP, which has no encryption mechanism of its own, and combines it with the SSL/TLS protocol. Before any data is transmitted, the client and the server perform an SSL/TLS handshake in which they negotiate the cipher suite used for the session, securely exchange keys, and implement mutual authentication. Once the secure channel is established, the HTTP application data is encrypted in transit, which protects the communication content against eavesdropping and tampering. Traditional payload-based methods cannot handle encrypted traffic, so classifying and analyzing encrypted traffic from traffic characteristics with machine learning has become the mainstream approach. This paper builds a supervised learning model on features engineered from network flow data and, without compromising the integrity of the encryption, applies the C4.5 decision tree algorithm in a LAN environment to analyze the HTTPS-encrypted data flows of applications in the Tencent network. The method can effectively and accurately classify the HTTPS-encrypted traffic of websites.

Key words: Classification, Decision tree, Encrypted traffic, HTTPS, SSL/TLS

CLC Number: 

  • TP181
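
As an illustration of the split criterion behind the C4.5 algorithm named in the abstract, the following is a minimal sketch of the gain ratio for one numeric flow feature. It is not the authors' implementation: the feature (mean packet size), the class labels (`chat`, `video`) and all values are hypothetical toy data, and the threshold search is simplified to midpoints between distinct values.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` on a numeric feature."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    total = len(labels)
    w_left, w_right = len(left) / total, len(right) / total
    # Information gain: entropy before the split minus weighted entropy after it.
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    # Split information penalizes splits that produce very uneven partitions.
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return info_gain / split_info


def best_threshold(values, labels):
    """Return (threshold, gain_ratio) of the best midpoint split, or None."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    scored = ((t, gain_ratio(values, labels, t)) for t in candidates)
    return max(scored, key=lambda pair: pair[1], default=None)


# Hypothetical toy flows: mean packet size in bytes and an application label.
mean_packet_size = [310, 295, 330, 1210, 1180, 1250]
app_label = ["chat", "chat", "chat", "video", "video", "video"]

threshold, ratio = best_threshold(mean_packet_size, app_label)
print(f"split at mean_packet_size <= {threshold}, gain ratio = {ratio:.3f}")
```

A full decision tree would repeat this search over every flow feature, split on the best one, and recurse on each partition; pruning and the handling of missing values are omitted here.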