DGA Domains Detection Based on Artificial and Depth Features

doi:10.11896/jsjkx.191000118

Abstract: Many malware families now use domain generation algorithms (DGAs) to produce large numbers of pseudo-random domain names for contacting command and control (C&C) servers and launching attacks. Existing methods for detecting DGA domains fall into two groups. The first applies machine learning to handcrafted ("artificial") features built from the randomness of DGA domain names; it requires time-consuming, labor-intensive feature engineering and suffers from a high false alarm rate. The second uses deep learning models such as LSTM and GRU to learn the sequential patterns of DGA domain names; it has low detection accuracy for DGA domain names with low randomness. This paper therefore proposes a generic feature extraction scheme for domain names, builds a data set covering 41 DGA domain name families, and designs a detection algorithm that combines handcrafted features with deep ("depth") features, which improves the model's generalization ability and broadens the range of DGA families it can identify. Experimental results show that the proposed algorithm achieves higher accuracy and better generalization than traditional deep learning methods.

Key words: Domain generation algorithms, Domain name detection, Feature engineering, Long short-term memory

CLC Number: TP393.0

HU Peng-cheng, DIAO Li-li, YE Hua, YANG Yan-lan. DGA Domains Detection Based on Artificial and Depth Features[J].Computer Science, 2020, 47(9): 311-317.
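
The abstract refers to handcrafted features built from the randomness of a domain name but does not list them. The following is a minimal sketch of what such features might look like, not the paper's actual feature set: the function names, the choice of features (length, character entropy, vowel ratio, digit ratio), and the naive use of the first label as the name to score are all assumptions made for illustration.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits; 0.0 for an empty string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def handcrafted_features(domain: str) -> dict:
    """Compute a few illustrative randomness features for a domain name.

    Assumption: only the first label is scored (e.g. "google" in
    "google.com"); there is no public-suffix or subdomain handling.
    """
    label = domain.lower().strip(".").split(".")[0]
    length = len(label)
    vowels = sum(ch in VOWELS for ch in label)
    digits = sum(ch.isdigit() for ch in label)
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "vowel_ratio": vowels / length if length else 0.0,
        "digit_ratio": digits / length if length else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical inputs: one dictionary-like name, one random-looking name.
    for name in ("google.com", "xk2q9zv7hw1b.net"):
        print(name, handcrafted_features(name))
```

In a combined model of the kind the abstract describes, a vector of such values would be supplied alongside the representation learned by a sequence model such as an LSTM; how the paper actually fuses the two is not stated in the abstract.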

