Computer Science ›› 2020, Vol. 47 ›› Issue (9): 311-317.doi: 10.11896/jsjkx.191000118

• Information Security • Previous Articles     Next Articles

DGA Domains Detection Based on Artificial and Depth Features

HU Peng-cheng1, DIAO Li-li2, YE Hua1, YANG Yan-lan1   

  1. 1 School of Automation,Southeast University,Nanjing 210096,China
    2 Core Technology-Research,Trend Micro China Development Center,Nanjing 210012,China
  • Received:2019-10-18 Published:2020-09-10
  • About author:HU Peng-cheng,born in 1996,postgra-duate.His main research interests include pattern recognition and intelligent system,data mining,deep learning,network security,etc.
    YE Hua,born in 1961,Ph.D.His main research interests include intelligent control,pattern recognition,computer application,intelligent building,intelligent robot,fault diagnosis,etc.

Abstract: Nowadays,various families of malware use domain generation algorithms (DGAs) to generate a large number of pseudo-random domain names to connect to C&C (Command and Control) servers,in order to launch corresponding attacks.There are two existing methods to detect DGA domains.On the one hand,it is a machine learning method based on the randomness of DGA domain name to construct artificial features.This kind of algorithm has the problems of time-consuming and laborious artificial feature engineering and high false alarm rate and so on.On the other hand,LSTM,GRU and other deep learning technologies are used to learn the sequence relationship of DGA domain names.This kind of algorithm has a low detection accuracy for DGA domain names with low randomness.Therefore,this paper proposes a domain name generic feature extraction scheme,establishes a data set containing 41 DGA domain name families,and designs a detection algorithm based on artificial features and depth features that enhances the generalization ability of the model and improves the identification types of DGA domain families.Experimental results show that DGA domain name detection algorithm based on artificial features and depth features has achieved higheraccuracy and better generalization ability than traditional deep learning methods.

Key words: Domain generation algorithms, Domain name detection, Long short-term memory, Feature engineering

CLC Number: 

  • TP393.0
[1] KÜHRER M,ROSSOW C,HOLZ T.Paint it black:Evaluating the effectiveness of malware blacklists[C]//International Workshop on Recent Advances in Intrusion Detection.Cham:Sprin-ger,2014:1-21.
[2] ANTONAKAKIS M,PERDISCI R,NADJI Y,et al.FromThrow-Away Traffic to Bots:Detecting the Rise of DGA-Based Malware[C]//21th USENIX Security Symposium.2012.
[3] YADAV S,REDDY A K K,REDDY A L N,et al.Detecting Algorithmically Generated Malicious Domain Names[C]//Proceedings of the 10th ACM SIGCOMM Conference on Internet Measurement 2010.Melbourne,Australia,ACM,2010.
[4] KRISHNAN,TAYLOR T,MONROSE F,et al.Crossingthethreshold:Detecting network malfeasance via sequential hypothesis testing[C]//2013 43rd Annual IEEE/IFIP InternationalConference on Dependable Systems and Networks (DSN).IEEE Computer Society,2013.
[5] MOWBRAY M,HAGEN J.Finding Domain-Generation Algo-rithms by Looking at Length Distribution[C]//IEEE International Symposium on Software Reliability Engineering Workshops.IEEE,2014.
[6] WOODBRIDGE J,ANDERSON H S,AHUJA A,et al.Predicting domain generation algorithms with long short-term memory networks[J].arXiv:1611.00791,2016.
[7] LISON P,MAVROEIDIS V.Automatic detection of malware-generated domains with recurrent neural models[J].arXiv:1709.07102,2017.
[8] CHEN L H,CHEN H,FANG Y Q.Detecting Domain Genera-.tion Algorithm Based on Attention Mechanism.[J].Journal of east China University of Science and Technology (Natural Science Edition),2019(3).
[9] LIAO K,ZHAO Z,DOUPEA,et al.Behind closed doors:mea-surement and analysis of CryptoLocker ransoms in Bitcoin[C]//Electronic Crime Research.IEEE,2016.
[10] SULKOSWKI A J.Cyber-Extortion:Duties and Liabilities Related to the Elephant in the Server Room[J/OL].SSRN Electronic Journal.https://ssrn.com/abstract=955962.
[11] ATZENI A,DIAZ F,LOPEZ F,et al.The Rise of AndroidBanking Trojans[J].IEEE Potentials,2020,39(3):13-18.
[12] ALBANESIUSC.Ramnit computer worm compromises 45K facebook logins[J/OL].http://www.pcmag.com/article2/0.
[13] PLOHMANN D,YAKDAN K,KLATT M.A comprehsivemeasurement study of domain generatingmalware[C]//25th USENIX Security Symposium.Austin:Usenix,2016:263-278.
[14] Gibberish-Detector[OL].https://github.com/rrenaud/Gibberi-sh-Detector.
[15] DGA feature mining[OL].https://www.cnblogs.com/bonelee/p/7640055.html.
[16] LI H.Statistical learning methods [M].Beijing:Tsinghua University Press,2012.
[17] ROBINSON A J.An application of recurrent neural nets tophone probability estimation[J].IEEE Trans.on Neural Networks,1994,5(2):298-305.
[18] BENGIO Y,BOULANGER-LEWANDOWSKI N,PASCANUR.Advances in optimizing recurrent networks[C]//2013 IEEE International Conference on Acoustics,Speech and Signal Processing.IEEE,2013.
[19] GRAVES A.Long Short-Term Memory[M]//Supervised Sequence Labelling with Recurrent Neural Networks.2012.
[20] GERS F A,SCHRAUDOLPHN N,SCHMIDHUBER.Learning Precise Timing with LSTM Recurrent Networks[J].Journal of Machine Learning Research,2003,3(1):115-143.
[1] ZHAO Jia-qi, WANG Han-zheng, ZHOU Yong, ZHANG Di, ZHOU Zi-yuan. Remote Sensing Image Description Generation Method Based on Attention and Multi-scale Feature Enhancement [J]. Computer Science, 2021, 48(1): 190-196.
[2] ZHANG Yu-shuai, ZHAO Huan, LI Bo. Semantic Slot Filling Based on BERT and BiLSTM [J]. Computer Science, 2021, 48(1): 247-252.
[3] CUI Tong-tong, WANG Gui-ling, GAO Jing. Ship Trajectory Classification Method Based on 1DCNN-LSTM [J]. Computer Science, 2020, 47(9): 175-184.
[4] YU Yi-lin, TIAN Hong-tao, GAO Jian-wei and WAN Huai-yu. Relation Extraction Method Combining Encyclopedia Knowledge and Sentence Semantic Features [J]. Computer Science, 2020, 47(6A): 40-44.
[5] LV Ze-yu, LI Ji-xuan, CHEN Ru-Jian and CHEN Dong-ming. Research on Prediction of Re-shopping Behavior of E-commerce Customers [J]. Computer Science, 2020, 47(6A): 424-428.
[6] BAO Zhen-shan, GUO Jun-nan, XIE Yuan and ZHANG Wen-bo. Model for Stock Price Trend Prediction Based on LSTM and GA [J]. Computer Science, 2020, 47(6A): 467-473.
[7] DIAO Li and WANG Ning. Research on Premium Income Forecast Based on X12-LSTM Model [J]. Computer Science, 2020, 47(6A): 512-516.
[8] HUANG Hong-wei,LIU Yu-jiao,SHEN Zhuo-kai,ZHANG Shao-wei,CHEN Zhi-min,GAO Yang. End-to-end Track Association Based on Deep Learning Network Model [J]. Computer Science, 2020, 47(3): 200-205.
[9] LIU Yun,YIN Chuan-huan,HU Di,ZHAO Tian,LIANG Yu. Communication Satellite Fault Detection Based on Recurrent Neural Network [J]. Computer Science, 2020, 47(2): 227-232.
[10] WANG Ting, XIA Yang-yu-xin, CHEN Tie-ming. Short-term Trend Forecasting of Stocks Based on Multi-category Feature System [J]. Computer Science, 2020, 47(11A): 491-495.
[11] WANG Qi-fa, WANG Zhong-qing, LI Shou-shan, ZHOU Guo-dong. Comment Sentiment Classification Using Cross-attention Mechanism and News Content [J]. Computer Science, 2020, 47(10): 222-227.
[12] LIU Hai-bo,WU Tian-bo,SHEN Jing,SHI Chang-ting. Advanced Persistent Threat Detection Based on Generative Adversarial Networks and Long Short-term Memory [J]. Computer Science, 2020, 47(1): 281-286.
[13] GE Shao-lin, YE Jian, HE Ming-xiang. Prediction Model of User Purchase Behavior Based on Deep Forest [J]. Computer Science, 2019, 46(9): 190-194.
[14] ZHENG Cheng, HONG Tong-tong, XUE Man-yi. BLSTM_MLPCNN Model for Short Text Classification [J]. Computer Science, 2019, 46(6): 206-211.
[15] PEI Lan-zhen, ZHAO Ying-jun, WANG Zhe, LUO Yun-qian. Comparison of DGA Domain Detection Models Using Deep Learning [J]. Computer Science, 2019, 46(5): 111-115.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] LEI Li-hui and WANG Jing. Parallelization of LTL Model Checking Based on Possibility Measure[J]. Computer Science, 2018, 45(4): 71 -75 .
[2] SUN Qi, JIN Yan, HE Kun and XU Ling-xuan. Hybrid Evolutionary Algorithm for Solving Mixed Capacitated General Routing Problem[J]. Computer Science, 2018, 45(4): 76 -82 .
[3] ZHANG Jia-nan and XIAO Ming-yu. Approximation Algorithm for Weighted Mixed Domination Problem[J]. Computer Science, 2018, 45(4): 83 -88 .
[4] WU Jian-hui, HUANG Zhong-xiang, LI Wu, WU Jian-hui, PENG Xin and ZHANG Sheng. Robustness Optimization of Sequence Decision in Urban Road Construction[J]. Computer Science, 2018, 45(4): 89 -93 .
[5] SHI Wen-jun, WU Ji-gang and LUO Yu-chun. Fast and Efficient Scheduling Algorithms for Mobile Cloud Offloading[J]. Computer Science, 2018, 45(4): 94 -99 .
[6] ZHOU Yan-ping and YE Qiao-lin. L1-norm Distance Based Least Squares Twin Support Vector Machine[J]. Computer Science, 2018, 45(4): 100 -105 .
[7] LIU Bo-yi, TANG Xiang-yan and CHENG Jie-ren. Recognition Method for Corn Borer Based on Templates Matching in Muliple Growth Periods[J]. Computer Science, 2018, 45(4): 106 -111 .
[8] GENG Hai-jun, SHI Xin-gang, WANG Zhi-liang, YIN Xia and YIN Shao-ping. Energy-efficient Intra-domain Routing Algorithm Based on Directed Acyclic Graph[J]. Computer Science, 2018, 45(4): 112 -116 .
[9] CUI Qiong, LI Jian-hua, WANG Hong and NAN Ming-li. Resilience Analysis Model of Networked Command Information System Based on Node Repairability[J]. Computer Science, 2018, 45(4): 117 -121 .
[10] SHI Chao, XIE Zai-peng, LIU Han and LV Xin. Optimization of Container Deployment Strategy Based on Stable Matching[J]. Computer Science, 2018, 45(4): 131 -136 .