Computer Science, 2023, Vol. 50, Issue (6A): 220100255-7. doi: 10.11896/jsjkx.220100255
ZHAO Chenxia (赵晨霞)1,2, SHU Hui (舒辉)2, SHA Zihan (沙子涵)2
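Abstract: In information security, encryption is used to protect information, so identifying the cryptographic algorithms inside executable files matters for keeping information secure. Most existing identification techniques target a single architecture and perform poorly in cross-architecture scenarios. This paper therefore proposes the IR2Vec model to address cryptographic algorithm identification across architectures. The model first addresses the cross-architecture problem by exploiting LLVM's ability to connect different front ends and back ends: LLVM-RetDec decompiles each executable into an intermediate representation (IR). An improved PV-DM model then embeds the semantics of the IR as vectors, and semantic similarity is judged by the cosine distance between those vectors. A variety of cryptographic algorithms are collected to build a cryptographic algorithm library. The target executable is compared one by one with every file in the library, and the entry with the highest similarity is taken as the identification result. Experimental results show that the technique effectively identifies cryptographic algorithms in executables. The model supports cross-identification of binaries across three architectures (X86, ARM, and MIPS), two compilers (Clang and GCC), and four optimization levels (O0, O1, O2, and O3).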
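The final matching step described in the abstract (compare the target's vector against every library entry and keep the most similar one) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names `cosine_similarity` and `identify`, the four-dimensional vectors, and the library labels are all hypothetical placeholders. In the paper the vectors would come from the improved PV-DM model applied to RetDec-produced LLVM IR, which is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def identify(target_vec, library):
    """Compare the target with every library entry and return the best match.

    `library` maps an algorithm label to its embedding vector.
    Returns a (label, similarity) pair for the most similar entry.
    """
    best_name, best_score = None, float("-inf")
    for name, vec in library.items():
        score = cosine_similarity(target_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical embeddings (assumption): real vectors would be produced by
# the embedding model from decompiled IR, and would have far more dimensions.
library = {
    "AES": [0.9, 0.1, 0.3, 0.0],
    "MD5": [0.1, 0.8, 0.0, 0.4],
    "RC4": [0.0, 0.2, 0.9, 0.1],
}
target = [0.8, 0.2, 0.4, 0.1]

name, score = identify(target, library)
print(f"best match: {name} (cosine similarity {score:.3f})")
```

With these placeholder vectors the target lies closest to the "AES" entry. Because cosine similarity ignores vector magnitude, only the direction of an embedding affects the ranking.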