Enhanced Binary Vulnerability Mining Based on Constraint Derivation

Computer Science, 2021, Vol. 48, Issue (3): 320-326. doi: 10.11896/jsjkx.200700047

ZHENG Jian-yun, PANG Jian-min, ZHOU Xin, WANG Jun

State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450002, China

Received: 2020-07-09 Revised: 2020-08-13 Online: 2021-03-15 Published: 2021-03-05
Corresponding author: PANG Jian-min (jianmin_pang@hotmail.com)
Author contact: jianyunzheng@126.com
About author: ZHENG Jian-yun, born in 1987, master. His main research interests include computer architecture, information security and machine learning.
PANG Jian-min, born in 1964, Ph.D, professor, Ph.D supervisor, is a senior member of China Computer Federation. His main research interests include computer architecture, information security and high performance computing.
Supported by:
National Natural Science Foundation of China (61802433, 61802435) and Zhijiang Lab (2018FD0ZX01).

Abstract

In recent years, using software similarity methods to mine homologous vulnerabilities has proved effective, but existing methods still fall short in accuracy. Building on existing software similarity methods, this paper proposes an enhanced similarity method based on constraint derivation. The method applies code normalization and standardization to reduce the noise introduced by the compilation environment, so that the decompiled code representations of homologous programs converge as far as possible under different compilation conditions. Using backward program slicing, it extracts the constraint derivations of the vulnerable function and of the vulnerability patch function. By comparing the similarity of the two constraint derivations, it filters out patch functions that are easily misjudged as vulnerable functions, which reduces false positives in the vulnerability mining results. Based on this method, we implement a prototype vulnerability mining system called VulFind. Experimental results show that VulFind improves the accuracy of software similarity analysis and, in turn, the accuracy of the vulnerability mining results.

Key words: Binary code analysis, Code normalization, Constraint derivation, Software similarity, Vulnerability mining
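
The filtering step described in the abstract can be illustrated with a small sketch. This is not VulFind's implementation: the representation of a constraint derivation as a list of normalized strings, the function names, and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration. The idea shown is only that a candidate function is kept as a likely vulnerability when its constraints resemble the vulnerable function's more than the patched function's.

```python
from difflib import SequenceMatcher
from typing import Sequence


def constraint_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized constraint strings (stand-in metric)."""
    return SequenceMatcher(None, a, b).ratio()


def _directed(reference: Sequence[str], other: Sequence[str]) -> float:
    """Average, over the reference constraints, of the best match found in `other`."""
    if not reference or not other:
        return 0.0
    best = [max(constraint_similarity(r, o) for o in other) for r in reference]
    return sum(best) / len(best)


def derivation_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Symmetric similarity between two constraint derivations."""
    return (_directed(a, b) + _directed(b, a)) / 2


def is_likely_vulnerable(candidate: Sequence[str],
                         vulnerable: Sequence[str],
                         patched: Sequence[str]) -> bool:
    """Keep the candidate only if it is closer to the vulnerable derivation."""
    return (derivation_similarity(candidate, vulnerable)
            > derivation_similarity(candidate, patched))


# Hypothetical derivations: the patch adds a bounds check before the copy.
VULNERABLE = ["memcpy(v0, v1, v2)"]
PATCHED = ["v2 <= v3", "memcpy(v0, v1, v2)"]

if __name__ == "__main__":
    unpatched_candidate = ["memcpy(v0, v1, v2)"]
    patched_candidate = ["v2 <= v3", "memcpy(v0, v1, v2)"]
    print(is_likely_vulnerable(unpatched_candidate, VULNERABLE, PATCHED))
    print(is_likely_vulnerable(patched_candidate, VULNERABLE, PATCHED))
```

The symmetric score matters here: a one-directional match would rate a patched candidate as fully covering the vulnerable derivation, because the patched code still contains every constraint of the vulnerable code plus the added check.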

CLC Number: TP311

ZHENG Jian-yun, PANG Jian-min, ZHOU Xin, WANG Jun. Enhanced Binary Vulnerability Mining Based on Constraint Derivation[J]. Computer Science, 2021, 48(3): 320-326. https://doi.org/10.11896/jsjkx.200700047


