Malicious Code Family Detection Method Based on Knowledge Distillation (基于知识蒸馏的恶意代码家族检测方法)

doi:10.11896/jsjkx.200900099

Computer Science, 2021, Vol. 48, Issue (1): 280-286. doi: 10.11896/jsjkx.200900099

• Information Security •

Malicious Code Family Detection Method Based on Knowledge Distillation

WANG Run-zheng^1, GAO Jian^1,2, HUANG Shu-hua^1,2, TONG Xin^1

1 College of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China
2 Key Laboratory of Safety Precautions and Risk Assessment, Beijing 102623, China

Received: 2020-09-13 Revised: 2020-11-10 Online: 2021-01-15 Published: 2021-01-15
About author: WANG Run-zheng, born in 1996, is a postgraduate and a member of the China Computer Federation. His main research interests include cyber security and malware.
GAO Jian, born in 1982, Ph.D. His main research interests include cyber security, malware and botnets.
Supported by:
National Key R&D Program of China, Key Program of the National Social Science Foundation of China (20AZD114), 2020 Special Project of Science and Technology Strengthening Police of Ministry of Public Security (2020GABJC01) and Basic Scientific Research Operating Expenses of the People's Public Security University of China (2019JKF218).

Abstract

In recent years, new varieties of malicious code have emerged continuously, and malware has become more covert and persistent, so rapid and effective methods for identifying malicious samples are urgently needed. To address this situation, a malicious code family detection method based on knowledge distillation is proposed. The method decompiles malicious samples and converts their binary content into images using malicious code visualization, which avoids dependence on traditional feature engineering. In the teacher network, a residual network extracts deep texture features from the images, and a channel attention mechanism is introduced to extract key information according to changes in channel weights. Detection models based on deep neural networks have large parameter counts and consume substantial computing resources. To solve these problems and to speed up identification of the samples under test, the teacher network is used to guide the training of the student network. The results show that the student network maintains the detection performance for malicious code families while reducing model complexity, which favors batch sample detection and deployment on mobile terminals.
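Two steps named in the abstract, turning a sample's bytes into an image and letting a teacher network guide a student network, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the image width, the loss function, or any hyperparameters, so `width`, `temperature` and `alpha` below are hypothetical values, and the loss follows the commonly used soft-target formulation of knowledge distillation (temperature-softened KL term plus hard-label cross-entropy). The residual network and channel attention are omitted.

```python
import math
from typing import List, Sequence


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Lay raw bytes out row by row as a grayscale image (one byte = one pixel, 0-255).

    `width` is an assumed value; the abstract does not state the image size.
    """
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))  # zero-pad the final, partial row
        rows.append(row)
    return rows


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: Sequence[float],
                      teacher_logits: Sequence[float],
                      label: int,
                      temperature: float = 4.0,  # assumed, not from the paper
                      alpha: float = 0.7) -> float:  # assumed, not from the paper
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    eps = 1e-12  # guards against log(0)
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions
    soft = sum(pt * math.log(pt / max(ps, eps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Ordinary cross-entropy of the student against the true family label
    hard = -math.log(max(softmax(student_logits)[label], eps))
    # The T^2 factor keeps the soft term's gradient scale comparable across temperatures
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard
```

In a training loop, `teacher_logits` would come from the frozen, already trained teacher and `student_logits` from the smaller student on the same image, with the loss averaged over a batch before backpropagation through the student only.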

Key words: Attention mechanism, Knowledge distillation, Malicious family, Residual network

CLC Number: TP309

WANG Run-zheng, GAO Jian, HUANG Shu-hua, TONG Xin. Malicious Code Family Detection Method Based on Knowledge Distillation [J]. Computer Science, 2021, 48(1): 280-286.

