Computer Science ›› 2026, Vol. 53 ›› Issue (3): 433-442. doi: 10.11896/jsjkx.250400026

• Information Security •

Data Distribution Matching for Monolithic Firmware Base Address Identification

CAI Ruijie, JIA Fan, YIN Xiaokang, ZHAO Fangfang, LIU Shengli

  1. Key Laboratory of Cyberspace Security, the Ministry of Education, Information Engineering University, Zhengzhou 450001, China
  • Received: 2025-04-07 Revised: 2025-05-24 Online: 2026-03-12
  • Corresponding author: ZHAO Fangfang (fangfangzhaowlc@163.com)
  • About author: (wsxcrj@163.com)
    CAI Ruijie, born in 1990, Ph.D. candidate, lecturer. His main research interests include network security and vulnerability discovery.
    ZHAO Fangfang, born in 1990, Ph.D., lecturer. Her main research interests include network security and network intrusion detection.

Abstract: Base address identification for monolithic firmware is the foundation of firmware security research. Existing methods and tools suffer from low identification rates, poor performance, and high resource consumption. To address these problems, this paper proposes a method for monolithic firmware base address identification based on data distribution matching. The method first calculates the effective character density of each part of the firmware and uses this density to partition the firmware into text data segments and non-text data segments. It then extracts the string constants contained in the firmware from the text data segments. By identifying and parsing register word-load instructions, the method extracts the absolute addresses embedded in the firmware and groups them into absolute address clusters according to the combination of the function each address is passed to and the register that holds it before the call. Finally, the method matches the distribution intervals of the absolute address clusters against those of the string constants in the firmware to establish the correspondence between absolute addresses and string constants, from which the base address is solved. Experimental results show that the identification efficiency of the proposed method is far higher than that of existing methods, and that it achieves a 100% base address identification success rate on a test set of 30 firmware samples.

Key words: Monolithic firmware, Base address identification, Data distribution, Character density, Absolute address cluster
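
The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define effective character density, so the sketch assumes it is the fraction of printable ASCII bytes in a fixed-size window. All function names and parameters (`char_density`, `text_windows`, `string_offsets`, `match_base`, `size`, `threshold`, `min_len`) are hypothetical. Extracting absolute addresses from load instructions is architecture-specific and is omitted, so one address cluster is simply passed in as a list. The matching step is a simplified stand-in: it aligns the lowest address of the cluster with each string offset in turn and counts how many addresses then land on string starts, which is the same as checking that the gaps between addresses coincide with gaps between strings.

```python
def char_density(window: bytes) -> float:
    """Fraction of printable ASCII bytes in a window (assumed definition)."""
    if not window:
        return 0.0
    printable = sum(1 for b in window if 0x20 <= b <= 0x7E)
    return printable / len(window)


def text_windows(firmware: bytes, size: int = 64, threshold: float = 0.9):
    """Start offsets of fixed-size windows whose density reaches the threshold."""
    return [off for off in range(0, len(firmware), size)
            if char_density(firmware[off:off + size]) >= threshold]


def string_offsets(firmware: bytes, min_len: int = 4):
    """File offsets of NUL-terminated printable strings of at least min_len bytes."""
    offsets, start = [], None
    for i, b in enumerate(firmware):
        if 0x20 <= b <= 0x7E:
            if start is None:
                start = i  # a new printable run begins here
        else:
            if start is not None and b == 0 and i - start >= min_len:
                offsets.append(start)
            start = None
    return offsets


def match_base(address_cluster, str_offsets):
    """Return (base, hits) for the candidate base that maps the most
    addresses of the cluster onto string start offsets."""
    addrs = sorted(set(address_cluster))
    offsets = set(str_offsets)
    if not addrs or not offsets:
        return None, 0
    best_base, best_hits = None, 0
    for anchor in sorted(offsets):
        base = addrs[0] - anchor  # assume the lowest address points at this string
        if base < 0:
            continue
        hits = sum(1 for a in addrs if (a - base) in offsets)
        if hits > best_hits:
            best_base, best_hits = base, hits
    return best_base, best_hits


if __name__ == "__main__":
    # Toy image: three strings separated by non-text bytes (illustrative data only).
    fw = (b"\x00" * 16 + b"boot ok\x00" + b"\xff" * 5 +
          b"net up\x00" + b"\x01\x02" + b"error: %d\x00")
    offs = string_offsets(fw)
    true_base = 0x08000000  # hypothetical load address
    cluster = [true_base + o for o in offs]
    base, hits = match_base(cluster, offs)
    print(hex(base), hits)
```

In this sketch a wrong anchor usually places only the anchored address on a string start, while the correct anchor places every address of the cluster on one, so the hit count separates the two. A real image would need the cluster-by-cluster handling and the density-based segmentation described in the abstract to keep the candidate set small.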

CLC Number:

  • TP391