Computer Science (计算机科学), 2023, Vol. 50, Issue 4: 351-358. doi: 10.11896/jsjkx.220300200

• Information Security •

Android Malware Family Classification Method Based on Synthetic Image and Xception Improved Model

YU Xingzhan, LU Tianliang, DU Yanhui, WANG Xirui, YANG Cheng

  1. College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  • Received: 2022-03-21 Revised: 2022-06-13 Online: 2023-04-15 Published: 2023-04-06
  • Corresponding author: LU Tianliang (lutianliang@ppsuc.edu.cn)
  • Author e-mail: 353823546@qq.com
  • About author: YU Xingzhan, born in 1995, master. His main research interests include cyber security and malware detection.
    LU Tianliang, born in 1985, Ph.D., associate professor, Ph.D. supervisor. His main research interests include cyber security and artificial intelligence.
  • Supported by:
    National Social Science Foundation of China, Key Project (20AZD114), and Fundamental Research Funds for the Central Universities, Major Project (2020JKF101).

Abstract: Existing Android malware family detection methods suffer from three problems: code visualization methods that capture too little information, classification performance that depends heavily on the number of samples in the dataset, and low classification accuracy. To address them, an Android malware family classification method based on images synthesized from multiple feature files and an improved Xception model is proposed. Firstly, three feature files are mapped to the R, G, and B channels to synthesize a color image. Then, the focal loss function is introduced into the improved Xception model to alleviate the negative impact of the imbalanced sample distribution. Finally, an attention mechanism is integrated into the improved model to extract malicious code image features along different dimensions, which improves the classification performance of the model. Experimental results show that the malicious code images synthesized by the proposed method contain richer features, that the method achieves higher accuracy than mainstream malware family classification methods, and that it classifies better on datasets with imbalanced class sizes.

Key words: Malware visualization, Android malware family classification, Attention mechanism, focal loss, Xception
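
The first two steps of the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not name the three feature files, the image size, or the loss hyperparameters, so the function names (`bytes_to_channel`, `synthesize_rgb`, `focal_loss`), the truncate-or-zero-pad sizing, and the defaults `gamma=2.0` and `alpha=0.25` are all assumptions made for illustration. The focal loss here uses a single scalar `alpha` rather than per-class weights, which is a simplification. The attention mechanism and the Xception network itself are not sketched.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def bytes_to_channel(data: bytes, side: int) -> List[int]:
    """Flatten one file into side*side byte values (0-255).

    Assumption: content is truncated or zero-padded to fit; the abstract
    does not state how the paper sizes its images.
    """
    n = side * side
    return list(data[:n]) + [0] * max(0, n - len(data))


def synthesize_rgb(file_r: bytes, file_g: bytes, file_b: bytes,
                   side: int = 64) -> List[List[Pixel]]:
    """Build a side x side image whose R, G, B channels come from three files."""
    r, g, b = (bytes_to_channel(f, side) for f in (file_r, file_g, file_b))
    pixels = list(zip(r, g, b))
    # Cut the flat pixel list into rows of length `side`.
    return [pixels[row * side:(row + 1) * side] for row in range(side)]


def focal_loss(probs: Sequence[Sequence[float]], labels: Sequence[int],
               gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Mean of -alpha * (1 - p_t)**gamma * log(p_t) over the batch.

    probs  : per-sample class probabilities (each row sums to 1)
    labels : integer class id of each sample
    gamma, alpha : illustrative defaults, not values reported by the paper
    """
    total = 0.0
    for row, label in zip(probs, labels):
        # p_t is the probability assigned to the true class; clamp to avoid log(0).
        p_t = min(max(row[label], 1e-7), 1.0)
        total += -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(labels)


# Tiny demonstration with made-up bytes standing in for three feature files.
image = synthesize_rgb(b"\x01\x02", b"\xff\xfe\xfd", b"\x10", side=2)
print(image[0][0])  # (1, 255, 16)

# A confidently correct sample contributes far less loss than a misclassified one.
print(focal_loss([[0.9, 0.1]], [0]) < focal_loss([[0.1, 0.9]], [0]))  # True
```

With `gamma=0` and `alpha=1` the loss reduces to ordinary cross-entropy. Larger `gamma` down-weights well-classified samples, so under-represented families that the model still gets wrong carry more of the training signal.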

CLC Number:

  • TP309