Computer Science, 2024, Vol. 51, Issue (1): 327-334. doi: 10.11896/jsjkx.230100116

• Information Security •

Cryptocurrency Mining Malware Detection Method Based on Sample Embedding

FU Jianming1, JIANG Yuqian1, HE Jia2, ZHENG Rui3, SURI Guga1, PENG Guojun1   

    1 Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
    2 Technology Center of Songshan Laboratory, Zhengzhou 450046, China
    3 College of Computer and Information Engineering, Henan University, Kaifeng, Henan 475000, China
  • Received: 2023-01-30 Revised: 2023-07-11 Online: 2024-01-15 Published: 2024-01-12
  • About author: FU Jianming, born in 1969, Ph.D, professor, Ph.D supervisor, is a member of CCF (No.07112S). His main research interests include system security and mobile security.
  • Supported by:
    National Natural Science Foundation of China (61972297, 62172308, 62272351) and National Key R & D Program of China (2021YFB3101201).

Abstract: Because it is highly profitable and anonymous, cryptocurrency mining malware poses a serious threat to computer users and causes them significant losses. To counter this threat, machine learning detectors based on static software features usually either select a single type of static feature or combine the detection results of different kinds of static features through ensemble learning. Both approaches ignore the internal relationships between the different kinds of static features, and their detection rates remain open to question. This paper starts from the internal hierarchical relationships within mining malware. It extracts the basic blocks, control flow graphs, and function call graphs of samples as static features, trains a three-layer model that embeds each kind of feature into a vector, aggregates the features step by step from the bottom layer to the top, and finally feeds the top-level features to a classifier to detect mining malware. To simulate real-world detection conditions, the model is first trained on a relatively small experimental data set and then tested on another, much larger data set. Experimental results show that the proposed method performs much better than several machine learning models proposed in recent years. The recall and accuracy of the three-layer embedding model are more than 7% and 3% higher, respectively, than those of the other models.
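The bottom-to-top aggregation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a hashed opcode count stands in for the trained basic-block embedding, and one round of neighbor averaging followed by mean pooling stands in for the trained graph models. All function names, the vector dimension, and the toy disassembly are assumptions made for illustration only.

```python
import zlib
from typing import Dict, List, Tuple

Vector = List[float]
Edges = List[Tuple[int, int]]
Function = Tuple[List[List[str]], Edges]  # (basic blocks, CFG edges)


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def embed_block(instructions: List[str], dim: int = 4) -> Vector:
    """Layer 1 (toy): normalized, hashed opcode counts for one basic block.

    Assumption: a trained instruction-level model would be used instead.
    """
    vec = [0.0] * dim
    for ins in instructions:
        opcode = ins.split()[0]
        vec[zlib.crc32(opcode.encode()) % dim] += 1.0
    total = sum(vec)
    return [v / total for v in vec] if total else vec


def embed_graph(node_vecs: List[Vector], edges: Edges) -> Vector:
    """Layers 2 and 3 (toy): average each node with its successors,
    then mean-pool all nodes into a single graph vector."""
    succ: Dict[int, List[int]] = {i: [] for i in range(len(node_vecs))}
    for src, dst in edges:
        succ[src].append(dst)
    mixed = [mean([node_vecs[i]] + [node_vecs[j] for j in succ[i]])
             for i in range(len(node_vecs))]
    return mean(mixed)


def embed_sample(functions: List[Function], call_edges: Edges,
                 dim: int = 4) -> Vector:
    """Blocks -> control flow graphs -> function call graph."""
    func_vecs = []
    for blocks, cfg_edges in functions:
        block_vecs = [embed_block(b, dim) for b in blocks]
        func_vecs.append(embed_graph(block_vecs, cfg_edges))
    return embed_graph(func_vecs, call_edges)


# Hypothetical sample: two functions, the first calling the second.
functions = [
    ([["push ebp", "mov ebp, esp"], ["xor eax, eax", "ret"]], [(0, 1)]),
    ([["call sub_401000", "ret"]], []),
]
sample_vec = embed_sample(functions, call_edges=[(0, 1)])
```

In this sketch, `sample_vec` plays the role of the top-level feature that would be passed to a classifier; any standard binary classifier could consume it.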

Key words: Cryptocurrency mining malware, Static analysis, Machine learning, Graph embedding

CLC Number: TP311