计算机科学 ›› 2019, Vol. 46 ›› Issue (7): 126-132.doi: 10.11896/j.issn.1002-137X.2019.07.020

• 软件与数据库技术 • 上一篇    下一篇

一种基于程序切片相似度匹配的脆弱性发现方法

刘强1,2,况晓辉1,3,陈华1,李响1,李广轲3   

  1. (军事科学院系统工程研究院信息系统安全技术国家级重点实验室 北京100101)1
    (清华大学计算机科学与技术系 北京100084)2
    (北京邮电大学 北京100876)3
  • 收稿日期:2018-06-09 出版日期:2019-07-15 发布日期:2019-07-15
  • 作者简介:刘 强(1979-),男,博士生,工程师,CCF学生会员,主要研究方向为网络与信息安全,E-mail:liuqiang_xjtu@163.com;况晓辉(1975-),男,教授,硕士生导师,CCF高级会员,主要研究方向为网络与信息安全,E-mail:xiaohui_kuang@163.com(通信作者);陈 华(1962-),女,研究员,硕士生导师,主要研究方向为信息安全;李 响(1983-),女,工程师,主要研究方向为网络与信息安全;李广轲(1995-),男,硕士生,主要研究方向为网络与信息安全。

Vulnerability Discovery Approach Based on Similarity Matching of Program Slicing

LIU Qiang1,2,KUANG Xiao-hui1,3,CHEN Hua1,LI Xiang1,LI Guang-ke3   

  1. (National Key Laboratory of Science and Technology on Information System Security,Institute of System and Engineering,Academy of Military Science,Beijing 100101,China)1
    (Department of Computer Science and Technology,Tsinghua University,Beijing 100084,China)2
    (Beijing University of Posts and Telecommunications,Beijing 100876,China)3
  • Received:2018-06-09 Online:2019-07-15 Published:2019-07-15

摘要: 基于脆弱性代码的相似度匹配是静态发现脆弱性的有效方法之一,在不降低漏报率的情况下如何降低误报率和提升分析效率是该方法优化的主要目标。针对这一挑战,提出了基于代码切片相似度匹配的脆弱性发现框架。文中研究了基于关键点的代码切片、特征抽取和向量化的方法,主要思想是以脆弱性代码的脆弱性语义上下文切片作为参照物,通过计算被测代码的切片与脆弱性样本切片的相似性来判断存在脆弱性的可能性。文中实现了该方法,并以已知脆弱性的开源项目为分析对象进行了验证。与已有研究的对比实验表明,切片相似度能更准确地刻画脆弱性上下文,通过切片技术优化了基于相似度匹配的脆弱性发现方法,有效降低了脆弱性发现的误报率和漏报率,验证了所提框架和方法的有效性。

关键词: 程序切片, 脆弱性分析, 相似度匹配

Abstract: Vulnerability analysis method based on similarity matching is one of the most effective vulnerability analysis methods.How to reduce the false positive rate without reducing the false negative rate,and increase the efficiency,are the main goals to optimize the method.Aiming at these challenges,this paper proposed an optimization vulnerability analysis framework based on similarity matching of program slices.In the framework,the methods of code slice,feature extraction and vectorization based on vulnerable key points were studied.The core ideal of the framework is taking vulnerability semantic context slice of the vulnerability code as a reference to calculate the similarity between the slice of the tested code and the vulnerable sample slice,and determining the likelihood of a vulnerability.This paper implemented this framework and validateed it with open source projects with known vulnerabilities.Compared with the existing research,the vulnerability slice similarity framework has the ability to close describe the vulnerability context,and the vulnerability discovery method based on similarity matching is optimized by the slice technique.The proposed framework and method are verified to effectively reduce the false positive rate and false negative rate ofvulnerabi-lity discovery.

Key words: Programing slicing, Similarity matching, Vulnerability analysis

中图分类号: 

  • TP393
[1]WU S Z,GUO T,DONG G W,et al.Software vulnerability analyses:A road map .Journal of Tsinghua University (Science and Technology),2012,52(10):1309-1319.(in Chinese)
吴世忠,郭涛,董国伟,等.软件漏洞分析技术进展[J].清华大学学报(自然科学版),2012,52(10):1309-1319.
[2]WIJAYASEKARA D,MANIC M,WRIGHT J,et al.Mining bug databases for unidentified software vulnerabilities[C]∥2012 5th International Conference on Human System Interactions (HSI).IEEE,2012.
[3]NEUHAUS S,ZIMMERMANN T,HOLLER C,et al.Predicting vulnerable software components[C]∥Proceedings of the 14th ACM Conference on Computer and Communications Security.ACM,2007.
[4]WEISER M.Program slicing∥Proceedings of the 5th International Conference on Software Engineering.IEEE Press,1981:439-449.
[5]KOREL B,LASKI J.Dynamic program slicing.Information processing letters,1988,29(3):155-163.
[6]GALLAGHER K B,LYLE J R.Using program slicing in software maintenance.IEEE Transactions on Software Enginee-ring,1991,17(8):751-761.
[7]HIERONS R,HARMAN M,DANICIC S.Using program sli- cing to assist in the detection of equivalent mutants.Software Testing Verification and Reliability,1999,9(4):233-262.
[8]VENKATESH G A.The semantic approach to program slicing∥ACM SIGPLAN Notices.ACM,1991,26(6):107-119.
[9]FIELD J,RAMALINGAM G,TIP F.Parametric program sli- cing∥Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages.ACM,1995:379-392.
[10]CANFORA G,CIMITILE A,DE LUCIA A.Conditioned program slicing.Information and Software Technology,1998,40(11-12):595-607.
[11]AGRAWAL H,DEMILLO R A,SPAFFORD E H.Debugging with dynamic slicing and backtracking.Software:Practice and Experience,1993,23(6):589-616.
[12]OTTENSTEIN K J,OTTENSTEIN L M.The program depen- dence graph in a software development environment∥ACM Sigplan Notices.ACM,1984,19(5):177-184.
[13]DANICIC S,HARMAN M,SIVAGURUNATHAN Y.A parallel algorithm for static program slicing.Information Proces-sing Letters,1995,56(6):307-313.
[14]BERGERETTI J F,CARR B A.Information-flow and data-flow analysis of while-programs.ACM Transactions on Programming Languages and Systems (TOPLAS),1985,7(1):37-61.
[15]ZHANG X J.Program slicing technology research and slicing scheme design .Chengdu:University of Electronic Science and Technology,2017.(in Chinese)
张新杰.程序切片技术研究及切片方案设计.成都:电子科技大学,2017.
[16]XU B,QIAN J,ZHANG X,et al.A brief survey of program slicing.ACM SIGSOFT Software Engineering Notes,2005,30(2):1-36.
[17]TIP F.A survey of program slicing techniques.Centrum voor Wiskunde en Informatica,1994.
[18]李必信.程序切片技术及其应用.北京:科学出版社,2006.
[19]Wisconsin Program-Slicing Project.The Wisconsin Program-Slicing Tool,Version 1.0.1[OL].http://www.cs.wisc.edu/wpis/slicing_tool.
[20]BALAKRISHNAN G,GRUIAN R,REPS T,et al.CodeSurfer/x86-A platform for analyzing x86 executables∥International Conference on Compiler Construction.Springer,Berlin,Heidelberg,2005:250-254.
[21]CUOQ P,KIRCHNER F,KOSMATOV N,et al.Framac∥International Conference on Software Engineering and Formal Methods.Springer,Berlin,Heidelberg,2012:233-247.
[22]SCHWARTZ-NARBONNE D,OH C,SCHF M,et al.VERMEER:A tool for tracing and explaining faulty C programs∥2015 IEEE/ACM 37th IEEE International Conference on Software Engineering (ICSE).IEEE,2015:737-740.
[23]ALOMARI H W,COLLARD M L,MALETIC J I,et al.srcSlice:very efficient and scalable forward static slicing.Journal of Software:Evolution and Process,2014,26(11):931-961.
[24]NEWMAN C D,SAGE T,COLLARD M L,et al.srcSlice:a tool for efficient static forward slicing∥IEEE/ACM Internatio-nal Conference on Software Engineering Companion (ICSE-C).IEEE,2016:621-624.
[25]LI R Q,ZENG G B.Slice Abstract Extraction in Program Source Code and Its Application in Search.Information Technology & Network Security,2018,37(3):122-125.(in Chinese)
李润青,曾国荪.程序源代码中的切片摘要提取及在搜索中的应用.信息技术与网络安全,2018,37(3):122-125.
[26]FENG Q,WANG M,ZHANG M,et al.Extracting conditional formulas for cross-platform bug search[C]∥ASIACCS.2017.
[27]YAMAGUCHI F,FELIX L,KONRAD R.Vulnerability extra- polation:Assisted discovery of vulnerabilities using machine learning[C]∥Proceedings of the 5th USENIX Conference on Offensive Technologies.USENIX Association,2011.
[28]YAMAGUCHI F,MARKUS L,KONRAD R.Generalized vulnerability extrapolation using abstract syntax trees[C]∥Proceedings of the 28th Annual Computer Security Applications Conference.ACM,2012.
[29]YAMAGUCHI F,et al.Chucky:exposing missing checks in source code for vulnerability discovery[C]∥Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security.ACM,2013.
[30]YAMAGUCHI F,MAIER A,GASCON H,et al.Automatic inference of search patterns for taint-style vulnerabilities∥2015 IEEE Symposium on Security and Privacy (SP).IEEE,2015:797-812.
[31]XU X,LIU C,FENG Q,et al.Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection∥Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security.ACM,2017:363-376.
[32]GAN S T,QIN X J,CHEN Z N,et al.Software vulnerability code clone detection method based on characteristic metrics[J].Journal of Software,2015,26(2):348-363 (in Chinese). 甘水滔,秦晓军,陈左宁,等.一种基于特征矩阵的软件脆弱性代码克隆检测方法.软件学报,2015,26(2):348-363.
[33]https://samate.nist.gov/SRD/testsuites/juliet/Juliet_Test_Sui-te_v1.3_for_C_Cpp.zip.
[1] 况晓辉,李津,赵刚.
一种基于多层拓扑的大规模分布式系统结构脆弱性分析算法
Multilayer Topology Structural Vulnerability Analysis Algorithm for Large-scale Distributed System
计算机科学, 2016, 43(8): 26-29. https://doi.org/10.11896/j.issn.1002-137X.2016.08.005
[2] 张武能,李宏伟,沈立炜,赵文耘.
面向数据库模式变更的代码演化推荐方法
Method of Code Evolution Recommendation for Database Schema Change
计算机科学, 2016, 43(2): 216-223. https://doi.org/10.11896/j.issn.1002-137X.2016.02.046
[3] 张广梅,李景霞.
利用变量状态转换模型进行部分软件错误的检测
Detecting Software Error by Using State Transition Model of Variable
计算机科学, 2015, 42(Z11): 504-507.
[4] 李昂,毛晓光,雷晏.
面向自动修复并融合失效场景的缺陷定位方法
Fault Localization Using Failure-related Contexts for Automatic Program Repair
计算机科学, 2015, 42(12): 102-104.
[5] 姜伟伟,刘光杰,戴跃伟.
基于Snort的Modbus TCP工控协议异常数据检测规则设计
Design of Modbus TCP Industrial Control Network Protocol Abnormal Data Detection Rules Based on Snort
计算机科学, 2015, 42(11): 212-216. https://doi.org/10.11896/j.issn.1002-137X.2015.11.044
[6] 王志文,黄小龙,王海军,刘烃,俞乐晨.
基于程序切片的测试用例生成系统研究与实现
Program Slicing-guied Test Case Generation System
计算机科学, 2014, 41(9): 71-74. https://doi.org/10.11896/j.issn.1002-137X.2014.09.012
[7] 张颖淳,王淑良.
基于复杂网络理论的电力基础设施网络关键组件辨识
Critical Components Identification of Power Infrastructure Systems Based on Complex Network Theory
计算机科学, 2014, 41(5): 275-279. https://doi.org/10.11896/j.issn.1002-137X.2014.05.058
[8] 陈翔,顾卫江,徐慧,顾庆,陈道蓄.
回归测试用例选择技术研究综述
Regression Testing Selection Techniques:A State-of-the-art Review
计算机科学, 2013, 40(10): 1-9.
[9] 张 磊,陶彬贤,钱 巨.
一种利用指向组合优化依赖图构建的方法
Using Points-to Combinations to Optimize Dependence Graph Construction
计算机科学, 2013, 40(1): 139-143.
[10] 况晓辉,赵刚,温研,许飞,苗青.
大规模分布式系统脆弱性分析框架研究
Research on Vulnerability Analysis Framework for Large-scale Distributed System
计算机科学, 2012, 39(6): 58-60.
[11] 姜淑娟,赵雪峰.
基于变量作用域的数据流分析
Data Flow Analysis Based on the Variable Scope
计算机科学, 2012, 39(3): 131-134.
[12] 陈锋,毛捍东,张维明,雷长海.
攻击图技术研究进展
Survey of Attack Graph Technique
计算机科学, 2011, 38(11): 12-18.
[13] 杨飏,张焕国,王后珍.
一种C程序内存访问缺陷自动化检测方法研究
Full-automatic Detection of Memory Safety Violations for C Programs
计算机科学, 2010, 37(6): 155-158.
[14] 王春雷,刘强,赵刚,戴一奇.
一种基于模型检测的二进制程序脆弱性分析框架
Vulnerability Analysis Framework for Binaries Based on Model Checking
计算机科学, 2010, 37(4): 120-.
[15] 易晓东 杨学军.
一种C程序断言的全自动静态验证方法

计算机科学, 2006, 33(9): 253-256.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!