Computer Science, 2023, Vol. 50, Issue (6A): 220400122-6. doi: 10.11896/jsjkx.220400122

• Information Security •

DGA Domain Name Detection Method Based on Similarity

SUN Haidong1, LIU Wanping1, HUANG Dong2   

  1 College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China;
  2 Key Laboratory of Advanced Manufacturing Technology of the Ministry of Education, Guizhou University, Guiyang 550025, China
  • Online:2023-06-10 Published:2023-06-12
  • About author: SUN Haidong, born in 1997, postgraduate. His main research interests include cyber security and domain name detection. LIU Wanping, born in 1986, Ph.D., associate professor, master's supervisor, is a member of China Computer Federation. His main research interests include network and information security.
  • Supported by:
    Natural Science Foundation of Chongqing, China (cstc2021jcyj-msxmX0594) and Science and Technology Research Project of Chongqing Education Commission (KJQN201901101).

Abstract: Botnets pose a serious threat to the Internet. Malicious activities that rely on botnets, such as distributed denial-of-service attacks and spam, can cause great losses to their targets. Botnet communication relies mainly on domain names produced by domain generation algorithms (DGAs), so these domain names need to be detected. Existing detection methods mainly extract domain name features from character encodings and then classify them with neural networks. Because they consider only character features, their accuracy in detecting malicious domain names is often low. To detect DGA domain names accurately, a method for calculating domain name character similarity and domain name node similarity is proposed, and malicious domain names are detected according to these similarities. First, a model based on a bidirectional gated recurrent unit (Bi-GRU) neural network is constructed to screen out the algorithmically generated domain names with obvious features in the dataset. Then, a recurrent neural network is used to cluster the screened malicious domain names. Finally, the similarity between each domain name to be detected and the known malicious domain names is calculated, and any domain name whose similarity exceeds a threshold is classified as malicious. Experimental results show that the method achieves an accuracy of 99.03% on datasets containing multiple categories of malicious domain names.
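
The final step of the abstract, comparing a candidate domain name with known malicious ones and flagging it when the similarity exceeds a threshold, can be illustrated with a minimal sketch. The abstract does not give the paper's similarity formulas, so this example substitutes a plain Jaccard similarity over character bigrams as a stand-in for "character similarity". The function names, the bigram choice, and the threshold of 0.5 are all hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set


def char_bigrams(domain: str) -> Set[str]:
    """Return the character bigrams of the leftmost label of a domain name."""
    label = domain.lower().split(".")[0]
    return {label[i:i + 2] for i in range(len(label) - 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def max_similarity(candidate: str, known_malicious: Iterable[str]) -> float:
    """Highest bigram similarity between the candidate and any known malicious name."""
    cand = char_bigrams(candidate)
    return max((jaccard(cand, char_bigrams(m)) for m in known_malicious), default=0.0)


def is_malicious(candidate: str, known_malicious: Iterable[str],
                 threshold: float = 0.5) -> bool:
    """Flag the candidate when its similarity is strictly greater than the threshold.

    The threshold value is an assumption chosen for illustration only.
    """
    return max_similarity(candidate, known_malicious) > threshold
```

In this sketch a candidate is compared against every known malicious name. The method described in the abstract first clusters the malicious names, which would presumably let a candidate be compared per cluster instead, and it also uses a node similarity that this example does not model.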

Key words: DGA domain name, Botnet, Domain name detection, Similarity calculation, Gated recurrent unit

CLC Number: 

  • TP393