Computer Science ›› 2022, Vol. 49 ›› Issue (9): 326-332. doi: 10.11896/jsjkx.220200163

• Information Security •

Strcmp-like Function Identification Method Based on Data Flow Feature Matching

HU An-xiang, YIN Xiao-kang, ZHU Xiao-ya, LIU Sheng-li

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing (Information Engineering University), Zhengzhou 450001, China
  • Received: 2022-02-25 Revised: 2022-06-03 Online: 2022-09-15 Published: 2022-09-09
  • Corresponding author: LIU Sheng-li (mr_shengliliu@163.com)
  • About author: HU An-xiang (hasafox@163.com), born in 1996, master. His main research interests include cyber security and embedded network devices.
    LIU Sheng-li, born in 1973, Ph.D., professor. His main research interests include network device security and network attack detection.
  • Supported by:
    Foundation Strengthening Key Project of Science & Technology Commission (2019-JCJQ-ZD-113).

Abstract: Embedded devices are now ubiquitous and are often deployed in security-critical positions and in privacy-sensitive settings close to end users. However, recent studies show that many embedded devices contain backdoors, the most common being hard-coded backdoors (i.e., password backdoors). Triggering a password backdoor necessarily involves a string comparison function (a strcmp-like function), so identifying these functions is important. Current identification of strcmp-like functions relies mainly on matching function signatures or control flow features. The former cannot recognize unknown (e.g., user-defined) strcmp-like functions and is strongly affected by the compilation environment; the latter suffers from high false positive and false negative rates. To address these problems, this paper proposes CMPSeek, a novel method for identifying strcmp-like functions. Building on the control flow of a function, the method analyzes the data flow features of strcmp-like functions and constructs an identification model, which is used to identify strcmp-like functions in binary programs, including stripped binaries. In addition, binary code is converted to the intermediate representation VEX IR so that the ARM, MIPS, PowerPC (PPC) and x86/64 instruction sets are supported. Experimental results show that, when source code, function names and similar information are unavailable, CMPSeek achieves better precision and recall than FLIRT and SaTC.

Key words: Function identification, Data flow analysis, Feature matching, Strcmp-like function, Binary program
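The abstract does not spell out the identification model, so the following is only a minimal sketch of what a data flow feature for a strcmp-like function might look like. It assumes a hypothetical three-address toy IR (the `Stmt` class and its `load`/`add`/`cmp`/`store` opcodes are invented for illustration and are not VEX IR), and the heuristic in `looks_like_strcmp` is an assumption, not CMPSeek's actual model. The idea it illustrates: inside a loop, values loaded through two different pointer parameters are compared with each other, and both pointers are advanced by a constant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stmt:
    """One statement of a hypothetical toy IR (not VEX IR)."""
    op: str       # "load", "add", "cmp" or "store"
    dst: str      # destination variable, "" if none
    srcs: tuple   # source variables or integer constants


def looks_like_strcmp(loop_body, params):
    """Heuristic (assumed, illustrative): the loop compares values loaded
    through two distinct parameters and steps both pointers by a constant."""
    ptr_origin = {p: p for p in params}  # pointer variable -> parameter it derives from
    loaded_from = {}                     # value variable -> parameter it was loaded through
    advanced = set()                     # parameters whose pointer is stepped by a constant
    compared = set()                     # parameters whose loaded values are compared
    for s in loop_body:
        if s.op == "add" and s.srcs[0] in ptr_origin and isinstance(s.srcs[1], int):
            ptr_origin[s.dst] = ptr_origin[s.srcs[0]]
            advanced.add(ptr_origin[s.srcs[0]])
        elif s.op == "load" and s.srcs[0] in ptr_origin:
            loaded_from[s.dst] = ptr_origin[s.srcs[0]]
        elif s.op == "cmp":
            a, b = s.srcs
            if a in loaded_from and b in loaded_from and loaded_from[a] != loaded_from[b]:
                compared |= {loaded_from[a], loaded_from[b]}
    return len(compared) == 2 and compared <= advanced


# Loop bodies of three toy functions written in the hypothetical IR.
strcmp_body = [
    Stmt("load", "t0", ("a",)),
    Stmt("load", "t1", ("b",)),
    Stmt("cmp", "", ("t0", "t1")),
    Stmt("add", "a", ("a", 1)),
    Stmt("add", "b", ("b", 1)),
]
strlen_body = [
    Stmt("load", "t0", ("a",)),
    Stmt("cmp", "", ("t0", 0)),
    Stmt("add", "a", ("a", 1)),
]
memcpy_body = [
    Stmt("load", "t0", ("src",)),
    Stmt("store", "", ("dst", "t0")),
    Stmt("add", "src", ("src", 1)),
    Stmt("add", "dst", ("dst", 1)),
]
```

With these inputs, `looks_like_strcmp(strcmp_body, ["a", "b"])` returns `True`, while the strlen-like and memcpy-like bodies return `False`: the first compares a loaded byte with a constant, and the second never compares two loaded values. A real implementation would work on lifted VEX IR and the recovered control flow graph rather than on a hand-written statement list.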

CLC Number: 

  • TP393