Cryptocurrency Mining Malware Detection Method Based on Sample Embedding

doi:10.11896/jsjkx.230100116

Abstract

Because it is highly profitable and anonymous, cryptocurrency mining malware poses a serious threat to computer users and causes them substantial losses. To counter this threat, machine learning detectors based on static software features usually either select a single type of static feature or combine the detection results of different kinds of static features through ensemble learning. Both approaches ignore the internal relationships among different kinds of static features, and their detection rate remains open to question. This paper starts from the internal hierarchical structure of mining malware. It extracts the basic blocks, control flow graphs, and function call graphs of samples as static features, trains a three-layer model that embeds each kind of feature into a vector, aggregates the features layer by layer from bottom to top, and finally feeds the top-level features to a classifier to detect mining malware. To simulate real-world detection conditions, the model is first trained on a relatively small experimental data set and then tested on another, much larger data set. Experimental results show that the proposed method performs much better than several machine learning models proposed in recent years: the recall and accuracy of the three-layer embedding model are more than 7% and 3% higher, respectively, than those of the other models.
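The bottom-up aggregation described above can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not specify the embedding networks, so this sketch assumes basic-block vectors are already available and substitutes plain neighbor averaging and mean pooling for the trained layers. All names (`mean_pool`, `propagate`, `embed_graph`, `embed_sample`) and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = List[float]
Graph = Dict[str, List[str]]  # node -> list of successor nodes


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average a non-empty sequence of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def propagate(node_vecs: Dict[str, Vector], edges: Graph) -> Dict[str, Vector]:
    """One round of neighbor averaging (assumed stand-in for a trained graph layer).

    Each node's new vector is the mean of its own vector and its successors'.
    """
    return {
        node: mean_pool([vec] + [node_vecs[s] for s in edges.get(node, [])])
        for node, vec in node_vecs.items()
    }


def embed_graph(node_vecs: Dict[str, Vector], edges: Graph) -> Vector:
    """Embed a whole graph: propagate once, then mean-pool all node vectors."""
    return mean_pool(list(propagate(node_vecs, edges).values()))


def embed_sample(
    functions: Dict[str, Tuple[Dict[str, Vector], Graph]],
    call_graph: Graph,
) -> Vector:
    """Aggregate bottom-up: basic blocks -> function (CFG) -> sample (FCG)."""
    # Middle layer: each function's control flow graph becomes one vector.
    func_vecs = {
        name: embed_graph(block_vecs, cfg_edges)
        for name, (block_vecs, cfg_edges) in functions.items()
    }
    # Top layer: the function call graph becomes one sample vector,
    # which would then be passed to a classifier.
    return embed_graph(func_vecs, call_graph)


# Toy sample: two functions with made-up 2-dimensional basic-block vectors.
functions = {
    "main": ({"b0": [1.0, 0.0], "b1": [0.0, 1.0]}, {"b0": ["b1"]}),
    "hash_loop": ({"b0": [1.0, 1.0]}, {}),
}
call_graph = {"main": ["hash_loop"]}

sample_vector = embed_sample(functions, call_graph)
print(sample_vector)
```

The point of the sketch is the data flow: a sample vector is built from function vectors, which are in turn built from basic-block vectors, so relationships between the three feature types are preserved rather than classified separately.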

Key words: Cryptocurrency mining malware, Static analysis, Machine learning, Graph embedding

CLC Number: TP311

FU Jianming, JIANG Yuqian, HE Jia, ZHENG Rui, SURI Guga, PENG Guojun. Cryptocurrency Mining Malware Detection Method Based on Sample Embedding[J]. Computer Science, 2024, 51(1): 327-334.
