Computer Science, 2023, Vol. 50, Issue (3): 391-398. doi: 10.11896/jsjkx.220200182

• Information Security •

Ransomware Early Detection Method Based on Deep Learning

LIU Wenjing, GUO Chun, SHEN Guowei, XIE Bo, LYU Xiaodan

  1. State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
  • Received: 2022-02-28 Revised: 2022-06-20 Online: 2023-03-15 Published: 2023-03-15
  • Corresponding author: GUO Chun (gc_gzedu@163.com)
  • About author: LIU Wenjing (lwj_gzedu@163.com), born in 1996, postgraduate, is a member of China Computer Federation. Her main research interests include network and information security.
    GUO Chun, born in 1986, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include malware analysis, data mining and intrusion detection.
  • Supported by:
    National Natural Science Foundation of China (62162009), Science and Technology Foundation of Guizhou Province, China (黔科合基础[2020]1Y268) and Guizhou Provincial Science and Technology Project (黔科合重大专项字[2018]3001).

Abstract: In recent years, ransomware has remained highly active and has caused serious economic losses. Because files encrypted by ransomware are difficult to recover, detecting ransomware promptly and accurately has become a research hotspot. To improve both the timeliness and the accuracy of detection, this paper analyzes the early-stage runtime behavior of multiple ransomware families and of benign software, and proposes a ransomware early detection method based on deep learning (REDMDL). REDMDL takes as input an application programming interface (API) call sequence of a certain length, collected in the initial stage of a program's execution. It represents the sequence by combining word vectors and position vectors, and feeds the result to a neural network that combines a deep convolutional neural network with a long short-term memory network (CNN-LSTM) to detect ransomware early. Experimental results show that REDMDL can determine with high accuracy, within seconds after a program starts running, whether it is ransomware or benign software.

Key words: Ransomware, Early detection, CNN, LSTM, API

CLC Number:

  • TP309
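
The pipeline summarized in the abstract (an early API call sequence, word vectors plus position vectors, then a CNN followed by an LSTM) can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the API vocabulary, the sequence length, all layer sizes, the sinusoidal form of the position vector, and every function name below are assumptions chosen for illustration. Weights are random and untrained, and bias terms are omitted for brevity.

```python
import math
import random

# Hypothetical API vocabulary; the paper's actual API set is not given in the abstract.
API_VOCAB = {"<pad>": 0, "<unk>": 1, "NtCreateFile": 2, "NtReadFile": 3,
             "NtWriteFile": 4, "CryptEncrypt": 5, "NtClose": 6}

SEQ_LEN = 8    # assumed length of the early API-call window
EMB_DIM = 4    # assumed embedding size
CONV_OUT = 3   # assumed number of convolution filters
KERNEL = 3     # assumed convolution kernel width
HIDDEN = 3     # assumed LSTM hidden size

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random, untrained weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(api_calls):
    """Map API names to ids, truncated or padded to SEQ_LEN."""
    ids = [API_VOCAB.get(n, API_VOCAB["<unk>"]) for n in api_calls[:SEQ_LEN]]
    return ids + [API_VOCAB["<pad>"]] * (SEQ_LEN - len(ids))

def position_vector(pos):
    """Sinusoidal position encoding (one common choice, assumed here)."""
    vec = []
    for i in range(EMB_DIM):
        angle = pos / (10000 ** (2 * (i // 2) / EMB_DIM))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec

WORD_EMB = rand_matrix(len(API_VOCAB), EMB_DIM)
CONV_W = rand_matrix(CONV_OUT, KERNEL * EMB_DIM)
LSTM_W = {gate: rand_matrix(HIDDEN, CONV_OUT + HIDDEN) for gate in "ifog"}
OUT_W = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)]

def embed(ids):
    """Add the word vector and the position vector of each API call."""
    return [[w + p for w, p in zip(WORD_EMB[t], position_vector(pos))]
            for pos, t in enumerate(ids)]

def conv1d_relu(x):
    """1D convolution over the time axis followed by ReLU."""
    out = []
    for start in range(len(x) - KERNEL + 1):
        window = [v for row in x[start:start + KERNEL] for v in row]
        out.append([max(0.0, sum(w * v for w, v in zip(filt, window)))
                    for filt in CONV_W])
    return out

def lstm_last_hidden(seq):
    """Run a single-layer LSTM and return its final hidden state."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for x in seq:
        xh = x + h  # concatenate the input features with the previous hidden state
        pre = {gate: [sum(w * v for w, v in zip(row, xh)) for row in LSTM_W[gate]]
               for gate in "ifog"}
        i = [sigmoid(z) for z in pre["i"]]
        f = [sigmoid(z) for z in pre["f"]]
        o = [sigmoid(z) for z in pre["o"]]
        g = [math.tanh(z) for z in pre["g"]]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h

def ransomware_score(api_calls):
    """Score in (0, 1); meaningless until the weights are trained."""
    h = lstm_last_hidden(conv1d_relu(embed(encode(api_calls))))
    return sigmoid(sum(w * v for w, v in zip(OUT_W, h)))

score = ransomware_score(["NtCreateFile", "NtReadFile", "CryptEncrypt",
                          "NtWriteFile", "NtClose"])
print(f"untrained score: {score:.3f}")
```

In practice such a model would be built and trained with a deep learning framework on labeled ransomware and benign traces. The sketch only shows how the three stages named in the abstract connect.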