Computer Science ›› 2025, Vol. 52 ›› Issue (11A): 250200120-11. doi: 10.11896/jsjkx.250200120

• Information Security •

MOBSF_rule Based Android Malware Detection Method

CHEN Weiguo1, ZHANG Gaofeng2, JIA Sheng1, XU Benzhu2, ZHENG Liping1   

  1 School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
  2 School of Software, Hefei University of Technology, Hefei 230601, China
  • Online:2025-11-15 Published:2025-11-10
  • Supported by:
    National Key R&D Program of China(2022YFC3900800).

Abstract: In Android application security research, a highly effective static analysis method is to decompile the application with reverse engineering tools and extract the Function Call Graph (FCG) from the decompiled code files as a primary feature for malware identification. FCG subgraphs built around sensitive APIs have been widely validated. However, most existing work relies on older sets of sensitive APIs that have not been updated as the system APIs evolve. Experiments show that when traditional sensitive API sets are used to extract feature nodes from an application's FCG, the desired feature nodes often cannot be obtained. Two causes stand out: successive Android releases adjust and replace many APIs, and reflection-related techniques allow system APIs to be invoked dynamically and implicitly. In response, building on the latest comprehensive research framework for Android applications, this paper proposes an Android malware detection method that extracts FCG subgraphs using MOBSF_rule. The method first generates the FCG from the decompiled code files of the application. It then uses the MOBSF_rule set to extract feature nodes, generates five-node, six-node, and seven-node graphs containing these feature nodes, and counts how often each subgraph configuration occurs. Finally, the resulting frequency matrix is fed into a machine learning method for training and inference. Compared with existing sensitive API sets, the proposed method has the following advantages. 1) The MOBSF_rule filtering rule set performs very well at extracting feature nodes, capturing key API features that include reflection mechanisms, component interactions, signature verification, network communication, and client/server (C/S) architecture communication. Compared with traditional sensitive API sets, the effective rate of feature extraction on the latest malware datasets increases by 69.765%. 2) The MOBSF_rule set extracts feature nodes reliably across different time tags and shows strong stability. It adapts to the continuous updates of the Android system and maintains a highly consistent feature extraction capability across versions. Between 2012 and 2022, compared with traditional sensitive API sets, the overall variance in feature extraction effectiveness across years decreases by 98.747%. 3) The method employs the Stacking ensemble learning approach; compared with the random forest ensemble learning method and the multilayer perceptron method, accuracy increases by 4.32%.
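The feature-extraction steps described in the abstract (match feature nodes against a rule set, grow five-, six-, and seven-node graphs around them, count configurations) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `RULES` patterns are hypothetical stand-ins and not the actual MOBSF_rule entries, the breadth-first neighbourhood is only one possible way to obtain a k-node graph, and keying a configuration by (node count, edge count) is a simplification of whatever configuration scheme the paper uses.

```python
import re
from collections import Counter, deque

# Hypothetical patterns standing in for MOBSF_rule entries (not the real rule set).
RULES = [
    re.compile(r"java/lang/reflect/Method;->invoke"),
    re.compile(r"java/net/URL;->openConnection"),
]


def feature_nodes(fcg):
    """Return the call-graph nodes whose name matches any rule pattern."""
    nodes = set(fcg) | {callee for callees in fcg.values() for callee in callees}
    return sorted(n for n in nodes if any(r.search(n) for r in RULES))


def k_node_subgraph(fcg, seed, k):
    """Collect up to k nodes around seed by breadth-first search over
    undirected adjacency, then return the induced directed edges."""
    undirected = {}
    for caller, callees in fcg.items():
        for callee in callees:
            undirected.setdefault(caller, set()).add(callee)
            undirected.setdefault(callee, set()).add(caller)
    seen, queue = [seed], deque([seed])
    while queue and len(seen) < k:
        current = queue.popleft()
        for neighbour in sorted(undirected.get(current, ())):
            if neighbour not in seen and len(seen) < k:
                seen.append(neighbour)
                queue.append(neighbour)
    chosen = set(seen)
    edges = {(a, b) for a in chosen for b in fcg.get(a, ()) if b in chosen}
    return chosen, edges


def frequency_vector(fcg, sizes=(5, 6, 7)):
    """Count subgraph configurations, keyed here by (node count, edge count)."""
    counts = Counter()
    for seed in feature_nodes(fcg):
        for k in sizes:
            nodes, edges = k_node_subgraph(fcg, seed, k)
            if len(nodes) == k:  # skip seeds whose neighbourhood is too small
                counts[(k, len(edges))] += 1
    return counts
```

In this sketch, `fcg` is a plain dictionary mapping each caller name to a list of callee names. Stacking one `frequency_vector` result per application, over a fixed ordering of configuration keys, would give a frequency matrix of the kind the abstract feeds to the classifier.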

Key words: Android, Function call graph, Sensitive API, Reflection mechanisms, Ensemble Learning

CLC Number: 

  • TP181