MACSPMD: Malicious API Call Sequential Pattern Mining Based Malware Detection

doi:10.11896/j.issn.1002-137X.2018.05.022

Abstract

Abstract: Dynamic-analysis-based malware detection methods are favored by researchers because they effectively counter polymorphism and code obfuscation and can detect new, previously unseen malware. In response, malware authors embed numerous anti-detection functions in their code to evade existing dynamic detection methods. To address this problem, a malware detection method based on malicious API call sequential pattern mining, MACSPMD, is proposed. First, the dynamic API call sequences of files are captured on a real machine that reproduces the malware's actual running environment. Second, the concept of objective-oriented association mining is introduced to mine malicious API call sequence patterns that represent potential malicious behavior patterns. Finally, the mined malicious API call sequence patterns are used as abnormal-behavior features for malware detection. Experimental results on a real data set show that MACSPMD achieves detection accuracies of 94.55% and 97.73% on unknown and evasive malware respectively, which is 2.47% and 2.66% higher than other malware detection methods based on API call data, and that its mining process takes less time. MACSPMD can therefore effectively detect both known and unknown malware, including evasive malware.

Key words: Malware detection, Evasive malware, Sequential pattern mining, API call sequence, Classification
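
The abstract does not give the details of the mining algorithm, so the following is only a minimal sketch of the general idea it describes: mining API call sequence patterns that are frequent in malicious samples and rare in benign ones, then using them as binary features. It assumes a simplified Apriori-style level-wise search rather than the authors' actual procedure. The function names, the `min_sup` and `min_conf` thresholds, and the toy call sequences are all hypothetical, and the final classification step is omitted.

```python
def contains(seq, pattern):
    """True if `pattern` occurs in `seq` as an ordered subsequence
    (the calls need not be contiguous)."""
    it = iter(seq)
    return all(call in it for call in pattern)


def mine_malicious_patterns(malicious, benign, min_sup=0.5, min_conf=0.8, max_len=3):
    """Level-wise search (illustrative, not the paper's algorithm).

    Keeps patterns whose support among malicious sequences is >= min_sup
    and whose confidence toward the malicious class is >= min_conf.
    """
    alphabet = sorted({call for seq in malicious for call in seq})
    frontier = [(call,) for call in alphabet]
    result = []
    for _ in range(max_len):
        survivors = []
        for pattern in frontier:
            mal_hits = sum(contains(seq, pattern) for seq in malicious)
            # Support is anti-monotone, so infrequent patterns are pruned here.
            if mal_hits == 0 or mal_hits / len(malicious) < min_sup:
                continue
            survivors.append(pattern)
            ben_hits = sum(contains(seq, pattern) for seq in benign)
            # Confidence: share of matching samples that are malicious.
            if mal_hits / (mal_hits + ben_hits) >= min_conf:
                result.append(pattern)
        # Extend every frequent pattern by one more call.
        frontier = [p + (call,) for p in survivors for call in alphabet]
    return result


def to_features(seq, patterns):
    """Binary feature vector: 1 where the sequence matches a mined pattern."""
    return [int(contains(seq, p)) for p in patterns]


# Toy data, invented for illustration only.
malicious = [
    ["OpenProcess", "VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"],
    ["RegOpenKey", "OpenProcess", "WriteProcessMemory", "CreateRemoteThread"],
]
benign = [
    ["CreateFile", "ReadFile", "CloseHandle"],
    ["OpenProcess", "ReadFile", "CloseHandle"],
]
patterns = mine_malicious_patterns(malicious, benign, min_sup=1.0, min_conf=1.0)
```

In this sketch a call that also appears in benign traces, such as `OpenProcess` on its own, survives the support check but fails the confidence check, so only the longer patterns built from it are kept as features. The resulting vectors from `to_features` could then be passed to any standard classifier.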

RONG Feng-ping, FANG Yong, ZUO Zheng, LIU Liang. MACSPMD: Malicious API Call Sequential Pattern Mining Based Malware Detection[J]. Computer Science, 2018, 45(5): 131-138. https://doi.org/10.11896/j.issn.1002-137X.2018.05.022

