Computer Science (计算机科学), 2025, Vol. 52, Issue 11A: 250200120-11. doi: 10.11896/jsjkx.250200120
陈维国1, 张高峰2, 贾晟1, 徐本柱2, 郑利平1
CHEN Weiguo1, ZHANG Gaofeng2, JIA Sheng1, XU Benzhu2, ZHENG Liping1
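Abstract: In static analysis for Android application security, an effective approach is to decompile an application with reverse-engineering tools and extract the function call graph (FCG) from the decompiled code files as the main feature for malware identification; FCG subgraphs built around sensitive APIs in particular have been widely validated. However, most existing work of this kind relies on older sensitive-API sets that have not been updated as the system APIs evolved. Experiments show that when a traditional sensitive-API set is used to extract feature nodes from an application's FCG, the required feature nodes often cannot be obtained. Two causes stand out: successive Android releases have significantly adjusted and replaced APIs, and reflection-related techniques allow system APIs to be invoked dynamically and implicitly. To address this, and building on the latest comprehensive framework for Android application research, this paper proposes an Android malware detection method that extracts FCG subgraphs using MOBSF_rule. The method first generates the FCG from the decompiled code files of an application. It then uses the MOBSF_rule rule set to extract feature nodes, generates the five-node, six-node, and seven-node subgraphs that contain these feature nodes, and counts the occurrence frequency of each subgraph configuration. Finally, the frequency matrix is fed into machine learning methods for training and inference. Compared with existing sensitive-API sets, the proposed method has the following advantages. 1) The MOBSF_rule filtering rule set performs well at extracting feature nodes and effectively captures key API features, including reflection, component interaction, signature verification, network communication, and client/server (C/S) architecture communication; compared with a traditional sensitive-API set, the effective feature-extraction rate on the latest malware dataset improves by 69.765%. 2) The MOBSF_rule rule set extracts feature nodes well across different time labels and is highly stable. It adapts to the continuous updating of the Android system and keeps its feature-extraction capability highly consistent across versions. Over 2012-2022, compared with a traditional sensitive-API set, the overall multi-year variance of the effective feature-extraction rate of the MOBSF_rule rule set is reduced by 98.747%. 3) A Stacking ensemble learning method is adopted; compared with the random forest ensemble learning method and the multilayer perceptron method, accuracy improves by 4.32%.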
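The feature-extraction stage described in the abstract (select feature nodes by rule, enumerate the k-node subgraphs that contain them, count each configuration) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `RULE_PATTERNS` is a hypothetical stand-in and not the actual MOBSF_rule set, the toy call graph is invented, edge direction is ignored, and a sorted degree sequence is used as a simplified configuration signature (it is isomorphism-invariant but does not separate every pair of non-isomorphic subgraphs). The enumeration is exhaustive and only suitable for small graphs.

```python
import re
from collections import Counter

# Hypothetical rule patterns for illustration only; NOT the real MOBSF_rule set.
RULE_PATTERNS = [
    r"java/lang/reflect/Method;->invoke",   # reflection
    r"java/net/URL;->openConnection",       # network communication
]

def feature_nodes(fcg, patterns):
    """Return the FCG nodes whose signature matches any rule pattern."""
    nodes = set(fcg) | {c for callees in fcg.values() for c in callees}
    return {n for n in nodes if any(re.search(p, n) for p in patterns)}

def undirected(fcg):
    """Build an undirected adjacency map from caller -> callees lists."""
    adj = {}
    for caller, callees in fcg.items():
        adj.setdefault(caller, set())
        for callee in callees:
            if callee != caller:  # skip self-recursion edges
                adj[caller].add(callee)
                adj.setdefault(callee, set()).add(caller)
    return adj

def connected_subgraphs(adj, seed, k):
    """All connected k-node sets that contain seed (exhaustive search)."""
    frontier = {frozenset([seed])}
    for _ in range(k - 1):
        frontier = {s | {n} for s in frontier for v in s
                    for n in adj[v] if n not in s}
    return frontier

def signature(adj, nodes):
    """Simplified configuration label: sorted degrees inside the subgraph."""
    return tuple(sorted(len(adj[v] & nodes) for v in nodes))

def subgraph_frequencies(fcg, patterns, k):
    """Count configurations of k-node subgraphs containing a feature node."""
    adj = undirected(fcg)
    found = set()  # a set, so a subgraph with two feature nodes counts once
    for seed in feature_nodes(fcg, patterns):
        found |= connected_subgraphs(adj, seed, k)
    return Counter(signature(adj, s) for s in found)

# Invented toy call graph: caller -> list of callees.
fcg = {
    "Lapp/Main;->onCreate": ["Lapp/Net;->fetch", "Lapp/Util;->call"],
    "Lapp/Net;->fetch": ["Ljava/net/URL;->openConnection"],
    "Lapp/Util;->call": ["Ljava/lang/reflect/Method;->invoke"],
}

freq = subgraph_frequencies(fcg, RULE_PATTERNS, k=3)
print(freq)  # Counter({(1, 1, 2): 2})
```

The abstract uses k = 5, 6, and 7; k = 3 is chosen here only so the toy graph yields more than one subgraph. In this sketch, one such `Counter` per application and per k would form one row of the frequency matrix that is passed to the classifier.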