Computer Science ›› 2020, Vol. 47 ›› Issue (1): 287-292. doi: 10.11896/jsjkx.181102118

• Information Security •

Classification and Evaluation of Mobile Application Network Behavior Based on Deep Forest and CWGAN-GP

JIANG Peng-fei, WEI Song-jie

  1. (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China)
  • Received: 2018-11-16 Published: 2020-01-20
  • Corresponding author: WEI Song-jie (swei@njust.edu.cn)
  • About author: JIANG Peng-fei, born in 1995, postgraduate, is not a member of the China Computer Federation (CCF). His main research interests include traffic analysis and deep learning. WEI Song-jie, born in 1977, Ph.D., professor, is a member of the China Computer Federation (CCF). His main research interests include network security, network data analysis and monitoring, abnormal event detection and simulation.
  • Supported by:
    This work was supported by the General Program of the National Natural Science Foundation of China (61472189) and the CERNET Next Generation Internet Technology Innovation Project (NGII20160105).

Abstract: Mobile applications are numerous and functionally complex, and a wide variety of malicious applications are mixed in among them. To address this, the paper analyzes the network behavior of applications on the Android platform and designs network behavior trigger events for different categories of applications in order to simulate their network interactions. It proposes the network event behavior sequence and, based on it, uses an improved deep forest model to classify and identify applications. The best classification accuracy reaches 99.03%, and the model shows high precision, high recall, a high F1-score and low training time. In addition, to address the limited number of application samples and the high time cost of data collection, a data augmentation method based on CWGAN-GP is proposed. Compared with the original generative adversarial network, this model trains more stably and can generate data of a specified category after only a single training run. Experimental results show that training the deep forest model on the generated data together with the original data improves its classification accuracy by about 9%.

Key words: Application classification, Deep forest, Generative adversarial network, Network behavior, Traffic classification
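
For readers unfamiliar with CWGAN-GP, the following is a sketch of the objective usually meant by that name: the WGAN-GP critic loss with a class label $y$ added as a condition. It is an assumption based on the standard formulation, not taken from this paper, whose exact loss and hyperparameters may differ. The symbols $D$ (critic), $G$ (generator), $z$ (noise), $\hat{x}$ (interpolated sample) and $\lambda$ (penalty weight) are notation introduced here for illustration.

```latex
\begin{aligned}
L_D &= \mathbb{E}_{z \sim p_z}\!\left[ D\big(G(z \mid y) \mid y\big) \right]
     - \mathbb{E}_{x \sim p_r}\!\left[ D(x \mid y) \right]
     + \lambda\, \mathbb{E}_{\hat{x} \sim p_{\hat{x}}}\!\left[ \big( \lVert \nabla_{\hat{x}} D(\hat{x} \mid y) \rVert_2 - 1 \big)^2 \right], \\
L_G &= -\,\mathbb{E}_{z \sim p_z}\!\left[ D\big(G(z \mid y) \mid y\big) \right], \\
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, G(z \mid y), \qquad \epsilon \sim U[0, 1].
\end{aligned}
```

The gradient penalty term replaces the weight clipping of the original WGAN and is the usual explanation for the more stable training mentioned in the abstract. Conditioning on $y$ is what allows one trained generator to produce samples of a specified category.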

CLC Number: 

  • P309
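
The abstract reports accuracy, precision, recall and F1-score. As a minimal sketch of how these are conventionally computed for a multi-class classifier, the Python below derives them from true and predicted labels. The function names and the category labels (`chat`, `video`, `game`) are hypothetical and chosen only for illustration; the paper's own evaluation code and application categories are not shown on this page.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that occurs."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[t] += 1      # t was the truth but was missed
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical labels, for illustration only.
y_true = ["chat", "chat", "video", "video", "game"]
y_pred = ["chat", "video", "video", "video", "game"]
print(accuracy(y_true, y_pred))
print(per_class_metrics(y_true, y_pred))
```

Precision and recall are distinct from accuracy: in this toy input the `chat` class has perfect precision but only half of its samples are recalled, which is why the abstract lists the metrics separately.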