Computer Science, 2024, Vol. 51, Issue (4): 388-395. doi: 10.11896/jsjkx.230100002

• Information Security •

Android Malware Detection Method Based on GCN and BiLSTM

HE Jiaojun, CAI Manchun, LU Tianliang   

  1. School of Information and Cyber Security, People's Public Security University of China, Beijing 100038, China
  • Received: 2023-01-03 Revised: 2023-04-03 Online: 2024-04-15 Published: 2024-04-10
  • Supported by:
    Major Project of Basic Scientific Research Expenses of People's Public Security University of China in 2022 (2022JKF02009) and Key Project of Public Security Risk Prevention, Control and Emergency Technical Equipment (20200017).

Abstract: Most existing Android malware detection methods learn features of a single structural type and fall short in analyzing application semantics. To address the incomplete capture of feature semantics by traditional detection methods, this paper proposes an Android malware detection model based on GCN and BiLSTM that extracts sample structure information while emphasizing the semantics of malicious behavior. First, the topological relationships among 26 types of key system calls are represented as a graph, and a two-layer GCN aggregates the high-order structural information of nodes in the system call graph, improving feature learning efficiency. Then, a BiLSTM network with a self-attention mechanism captures the context semantics of the opcode sequence; by assigning high weights to sequences with malicious features, it captures the strong correlations within the features. Finally, Softmax outputs the sample classification probability, fusing structural information with context features. In experiments on the Drebin and AndroZoo datasets, the proposed model reaches an accuracy of 93.95% and an F1 value of 0.97, a significant improvement over the benchmark algorithm. These results indicate that the proposed model based on GCN and BiLSTM can effectively determine the nature of applications and improve Android malware detection.
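
The abstract names a two-layer GCN over a system call graph and a Softmax output over fused structural and context features, but does not give the layer equations. The sketch below is a minimal, dependency-free illustration under stated assumptions: it uses the widely used symmetric-normalized propagation rule H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W), a mean readout over nodes, and plain concatenation for fusion. The function names, the readout, the fusion by concatenation, and all weight shapes are hypothetical choices for illustration, not the authors' implementation; the BiLSTM branch is represented only by a placeholder vector `seq_vec`.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}; adj is assumed to have no self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the added self-loop
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def relu(m):
    return [[max(0.0, v) for v in row] for row in m]


def gcn_two_layer(adj, features, w1, w2):
    """Two propagation steps, so each node sees its 2-hop neighbourhood."""
    a_norm = normalize_adjacency(adj)
    h1 = relu(matmul(matmul(a_norm, features), w1))
    return matmul(matmul(a_norm, h1), w2)


def mean_readout(h):
    """Average node embeddings into one graph-level vector (assumed readout)."""
    return [sum(col) / len(h) for col in zip(*h)]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def classify(graph_vec, seq_vec, w_out):
    """Concatenate both feature vectors and map them to class probabilities."""
    fused = graph_vec + seq_vec  # list concatenation, not element-wise addition
    logits = [sum(x * w for x, w in zip(fused, col)) for col in zip(*w_out)]
    return softmax(logits)


if __name__ == "__main__":
    # Toy 3-node call graph with one-hot node features (illustrative values).
    adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    w1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.6]]
    w2 = [[0.7, -0.1], [0.2, 0.9]]
    graph_vec = mean_readout(gcn_two_layer(adj, feats, w1, w2))
    seq_vec = [0.3, -0.5]  # placeholder for the BiLSTM + self-attention output
    w_out = [[0.4, -0.4], [0.1, 0.2], [-0.3, 0.5], [0.6, -0.1]]
    print(classify(graph_vec, seq_vec, w_out))
```

In the setting the abstract describes, the graph would have 26 nodes, one per key system call type. A trained model would learn `w1`, `w2`, and `w_out` rather than fix them by hand.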

Key words: Android, Malware detection, GCN, BiLSTM

CLC Number: 

  • TP309