Computer Science, 2019, Vol. 46, Issue (7): 86-90. doi: 10.11896/j.issn.1002-137X.2019.07.013

• Information Security •

Study on Malicious Program Detection Based on Recurrent Neural Network

WANG Le-le¹, WANG Bin-qiang¹, LIU Jian-gang², ZHANG Jian-hui¹, MIAO Qi-guang³

  1. National Digital Switching System Engineering and Technological Research Center, Zhengzhou 450000, China
  2. Nanjing Information Technology Institute, Nanjing 210000, China
  3. Department of Computer Science, Xidian University, Xi’an 710071, China
  • Received: 2018-09-08 Online: 2019-07-15 Published: 2019-07-15

Abstract: To address the low efficiency of traditional malicious program detection and the lack of automated analysis of malicious programs, this paper studies the use of recursive neural networks to detect and classify malicious programs in a deep learning setting. First, QEMU is used to capture the API calls and their parameter sequences issued while a malicious program runs; after behavior abstraction, these form the feature sequence of the malicious program. The feature sequence is then mapped to fixed-length word vectors using a hierarchical log-bilinear language model (HLBL), and the word vectors are assembled into the input matrix of a recursive neural network (RNN). Training the recursive neural network model yields a multi-level semantic aggregation model of malicious programs, which performs the classification and detection. The experimental data show that the recursive neural network model detects and classifies malicious programs effectively: compared with traditional machine learning algorithms, its detection rate increases by 17%. In particular, when tensors are introduced through the Recursive Neural Tensor Network (RNTN) model, the overall number of parameters and the amount of computation are reduced, and the detection rate increases by 7% compared with the RNN model. The experimental data indicate that the recursive neural network model can detect and classify malicious programs in a big data environment.

Key words: Quick emulator, Hierarchical log-bilinear language model, Word vector, Recursive neural network, Multi-level semantic aggregation model

CLC Number: 

  • TP393
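
The composition step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the standard recursive composition p = tanh(W[c1; c2] + b) and the tensor variant p_k = tanh(x^T V_k x + (Wx)_k + b_k) with x = [c1; c2]. The API names, the vector size `DIM`, the random vectors standing in for HLBL word vectors, the untrained weights, and the left-to-right tree shape are all hypothetical choices made only for illustration.

```python
import math
import random

DIM = 4  # hypothetical word-vector size; the paper's value is not given here


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for trained weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def compose_rnn(left, right, W, b):
    """Recursive composition: p = tanh(W [left; right] + b)."""
    x = left + right  # list concatenation gives a vector of length 2 * DIM
    return [math.tanh(z + bi) for z, bi in zip(matvec(W, x), b)]


def compose_rntn(left, right, W, b, V):
    """Tensor composition: p_k = tanh(x^T V[k] x + (W x)_k + b_k)."""
    x = left + right
    linear = matvec(W, x)
    bilinear = [sum(xi * yi for xi, yi in zip(x, matvec(Vk, x))) for Vk in V]
    return [math.tanh(q + l + bi) for q, l, bi in zip(bilinear, linear, b)]


def encode_sequence(vectors, compose):
    """Fold leaf vectors into one root vector along a left-branching binary tree."""
    root = vectors[0]
    for v in vectors[1:]:
        root = compose(root, v)
    return root


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)

# Hypothetical abstracted behavior sequence (API names chosen for illustration).
api_calls = ["CreateFile", "WriteFile", "RegSetValue", "CreateProcess"]

# Random vectors stand in for word vectors that HLBL would produce.
embeddings = {name: [rng.uniform(-1, 1) for _ in range(DIM)]
              for name in dict.fromkeys(api_calls)}
leaves = [embeddings[name] for name in api_calls]

W = rand_matrix(DIM, 2 * DIM, rng)                       # composition matrix
b = [0.0] * DIM                                          # composition bias
V = [rand_matrix(2 * DIM, 2 * DIM, rng) for _ in range(DIM)]  # one tensor slice per output unit
Ws = rand_matrix(2, DIM, rng)                            # classifier: benign vs malicious

root_rnn = encode_sequence(leaves, lambda l, r: compose_rnn(l, r, W, b))
root_rntn = encode_sequence(leaves, lambda l, r: compose_rntn(l, r, W, b, V))

probs_rnn = softmax(matvec(Ws, root_rnn))
probs_rntn = softmax(matvec(Ws, root_rntn))
print("RNN class probabilities: ", probs_rnn)
print("RNTN class probabilities:", probs_rntn)
```

With untrained weights the printed probabilities carry no meaning; the sketch only shows how a sequence of word vectors is aggregated level by level into a single root vector that a softmax classifier can label. Setting every slice of `V` to zero makes `compose_rntn` reduce to `compose_rnn`, which is one way to see the tensor term as an extension of the plain composition.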