Computer Science (计算机科学) ›› 2026, Vol. 53 ›› Issue (4): 435-444. doi: 10.11896/jsjkx.250500078
尹创, 刘建毅, 张茹
YIN Chuang, LIU Jianyi, ZHANG Ru
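Abstract: Ransomware encrypts critical data and extorts victims into paying a ransom. In 2023, total ransom payments caused by ransomware exceeded USD 1 billion. Accurate ransomware classification is important for security defense, but the number of available ransomware samples is often limited. To address this, a cross-modal fusion few-shot ransomware classifier, CMFu, is proposed. It consists of a feature construction module, an encoding module, and a fusion module. The feature construction module generates cross-modal features. The encoding module builds encoders on two pre-trained models and encodes the features of each modality. The fusion module integrates the encoded data and produces the final classification. Experiments evaluate model performance with training sample ratios of 10%, 30%, and 50%. CMFu outperforms the comparison models on all metrics. At a sample ratio of 30%, CMFu achieves a precision, recall, and F1 score of 0.91, 0.91, and 0.90, respectively, better than all comparison models. When the sample ratio drops to 10%, the metrics remain at a high level (0.78, 0.84, and 0.80), confirming the effectiveness of CMFu for few-shot ransomware classification. Ablation experiments verify the feasibility of the pre-training-based encoders and the necessity of fusion with a backbone network.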
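The evaluation protocol described in the abstract (training on 10%, 30%, or 50% of the samples and reporting precision, recall, and F1) can be illustrated with a minimal sketch. This is not the authors' code: the helper names `stratified_split` and `macro_prf` are hypothetical, and both per-class (stratified) sampling and macro averaging are assumptions, since the abstract states neither how the subsets are drawn nor how the metrics are averaged across families.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_ratio, seed=0):
    """Return (train_idx, test_idx), keeping about train_ratio of each class for training.

    Assumption: sampling is done per class; the abstract does not say so.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        # Keep at least one training sample per class, even at small ratios.
        n_train = max(1, round(len(indices) * train_ratio))
        train_idx.extend(indices[:n_train])
        test_idx.extend(indices[n_train:])
    return sorted(train_idx), sorted(test_idx)


def macro_prf(y_true, y_pred):
    """Macro-averaged precision, recall, and F1 over the classes present in y_true.

    Assumption: macro averaging; the abstract does not name the averaging scheme.
    """
    classes = sorted(set(y_true))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    # Toy labels standing in for ransomware family names (hypothetical data).
    labels = ["family_a"] * 10 + ["family_b"] * 10
    for ratio in (0.1, 0.3, 0.5):
        train_idx, test_idx = stratified_split(labels, ratio)
        print(ratio, len(train_idx), len(test_idx))
```

In a real experiment, the classifier would be trained on the `train_idx` samples for each ratio, and `macro_prf` would be applied to its predictions on the `test_idx` samples.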