Computer Science, 2019, Vol. 46, Issue (2): 145-151. doi: 10.11896/j.issn.1002-137X.2019.02.023

• Information Security •

Android Malware Detection Based on N-gram

ZHANG Zong-mei (1), GUI Sheng-lin (1,2), REN Fei (2)

  1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
  2. The 30th Institute of China Electronics Technology Group Corporation, Chengdu 610041, China
  • Received: 2018-01-18   Online: 2019-02-25   Published: 2019-02-25

Abstract: With the widespread use of the Android operating system, malicious applications are constantly emerging on the Android platform, and the means by which they evade existing detection tools are becoming increasingly sophisticated. More efficient detection techniques are therefore required to analyze malicious behavior effectively. This paper presents a static malware detection model based on N-gram technology. The model decompiles Android APK files through reverse engineering and uses N-grams to extract features from the bytecode, which avoids the dependence on expert knowledge found in traditional detection. The model is combined with a deep belief network, which allows it to train on and classify application samples rapidly and accurately. A total of 1267 malicious samples and 1200 benign samples were tested, and the results show an overall accuracy of up to 98.34%. Furthermore, the model was compared with other machine learning algorithms and with the detection results reported in related work. The comparison shows that the model has better accuracy and robustness.

Key words: Android application, Malware detection, N-gram, Deep belief network, Static detection
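
The N-gram feature extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact pipeline, so everything here is an assumption made for illustration: the function names, the choice of n = 2, the binary presence encoding, and the short opcode sequence (which uses real Dalvik opcode names but is not taken from any sample in the paper).

```python
from collections import Counter
from typing import List, Sequence, Tuple

Gram = Tuple[str, ...]


def opcode_ngrams(opcodes: Sequence[str], n: int) -> Counter:
    """Count every contiguous run of n opcodes in one opcode sequence."""
    if n < 1:
        raise ValueError("n must be >= 1")
    # A sequence shorter than n yields an empty range, hence no n-grams.
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


def binary_feature_vector(counts: Counter, vocabulary: Sequence[Gram]) -> List[int]:
    """Encode presence (1) or absence (0) of each vocabulary n-gram.

    Binary encoding is an assumption; the abstract does not say whether
    the model uses presence, counts, or frequencies.
    """
    return [1 if gram in counts else 0 for gram in vocabulary]


# Hypothetical opcode sequence, as might be recovered from decompiled bytecode.
sample = ["invoke-virtual", "move-result", "if-eqz", "invoke-virtual", "move-result"]

counts = opcode_ngrams(sample, n=2)
vocabulary = sorted(counts)  # in practice, built over the whole training set
vector = binary_feature_vector(counts, vocabulary)
```

In a full system the vocabulary would be collected across all training applications, and the resulting vectors would be the input to the classifier (a deep belief network in this paper). Because the features come directly from opcode runs, no hand-picked list of suspicious APIs or permissions is needed, which is the independence from expert knowledge that the abstract refers to.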

CLC Number: TP309.5