Computer Science ›› 2023, Vol. 50 ›› Issue (8): 251-259. doi: 10.11896/jsjkx.220700277

• Information Security •

Survey of DGA Domain Name Detection Based on Character Feature

WANG Yu1, WANG Zuchao1, PAN Rui2   

  1. School of Science, China University of Geosciences (Beijing), Beijing 100083, China
  2. China Academy of Information and Communications Technology, Beijing 100191, China
  • Received: 2022-07-28 Revised: 2022-11-24 Online: 2023-08-15 Published: 2023-08-02
  • About author: WANG Yu, born in 1996, postgraduate, is a member of the China Computer Federation. Her main research interests include data mining and deep learning in DGA domain name detection.
    PAN Rui, born in 1988, master, senior engineer. His main research interests include cyber security, data governance and data security.
  • Supported by:
    National Natural Science Foundation of China (62071152).

Abstract: Recent years have seen extensive adoption of domain generation algorithms (DGAs) by botnets. Efficient detection of DGA domain names is therefore important for discovering botnets and ensuring cyber security. Detection methods based on character features need only the domain name string to reach a decision, so they can operate in real time, and they have become a focus of research on DGA domain name detection. Research on such methods shows that DGA domain names can be detected effectively with traditional machine learning or deep learning. However, for wordlist-based DGA domain names, short DGA domain names, and new DGA variants, detection ability still needs to be improved, for example by improving the word embedding method, introducing attention mechanisms, or incorporating adversarial samples. Finally, this paper summarizes the above methods, analyzes their advantages and existing problems, and proposes future research directions and key issues that need to be addressed for DGA domain name detection.

Key words: Cyber security, DGA domain name detection, Machine learning, Deep learning, Word embedding, Attention mechanism, Adversarial example

CLC Number: 

  • TP393.08
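
The abstract notes that character-feature methods work from the domain name string alone. As a rough illustration of what that input can look like for a traditional machine learning classifier, the sketch below computes a few hand-crafted character statistics. The function name `char_features`, the choice of features (length, Shannon entropy, digit ratio, vowel ratio), and the decision to analyze only the leftmost label are assumptions made for this example; they are not taken from the surveyed methods.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def char_features(domain: str) -> dict:
    """Compute simple character statistics from a domain name string.

    Hypothetical helper for illustration only; the feature set is an
    assumption, not the one used by any specific surveyed method.
    """
    # Assumption: analyze only the leftmost label, e.g. "abcd" in "abcd.com".
    label = domain.lower().rstrip(".").split(".")[0]
    n = len(label)
    if n == 0:
        return {"length": 0, "entropy": 0.0, "digit_ratio": 0.0, "vowel_ratio": 0.0}

    # Shannon entropy of the character distribution, in bits.
    counts = Counter(label)
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())

    return {
        "length": n,
        "entropy": entropy,
        "digit_ratio": sum(ch.isdigit() for ch in label) / n,
        "vowel_ratio": sum(ch in VOWELS for ch in label) / n,
    }


if __name__ == "__main__":
    for name in ("google.com", "x7k2q9zt4p.net"):
        print(name, char_features(name))
```

A feature dictionary like this could be fed to any off-the-shelf classifier. Deep learning approaches of the kind the abstract mentions instead learn representations directly from the character sequence, which is why word embedding and attention mechanisms appear among the proposed improvements.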