Computer Science ›› 2026, Vol. 53 ›› Issue (3): 433-442. doi: 10.11896/jsjkx.250400026

• Information Security •

Data Distribution Matching for Monolithic Firmware Base Address Identification

CAI Ruijie, JIA Fan, YIN Xiaokang, ZHAO Fangfang, LIU Shengli   

  1. Key Laboratory of Cyberspace Security, the Ministry of Education, Information Engineering University, Zhengzhou 450001, China
  • Received: 2025-04-07 Revised: 2025-05-24 Published: 2026-03-12
  • About author: CAI Ruijie, born in 1990, Ph.D. candidate, lecturer. His main research interests include network security and vulnerability discovery.
    ZHAO Fangfang, born in 1990, Ph.D., lecturer. Her main research interests include network security and network intrusion detection.

Abstract: Base address identification for monolithic firmware is the foundation of firmware security research. Current methods and related tools suffer from low identification rates, poor performance, and high resource consumption. To address these limitations, this paper proposes a new method for monolithic firmware base address identification based on data distribution matching. The method first calculates the effective character density of each firmware section and uses this density metric to partition the firmware into text segments and non-text segments. It then extracts the string constant data from the text segments. By identifying and parsing register load instructions, it retrieves the absolute address data embedded in the firmware. These absolute addresses are clustered according to both their destination functions and the source registers used prior to function invocation. Finally, the base address is determined by matching the distribution patterns of the absolute address clusters against those of the string constant data within the firmware, which establishes the correspondence between the two. Experimental results demonstrate that the proposed method significantly outperforms existing approaches in recognition efficiency, achieving a 100% success rate on a test set of 30 firmware samples.
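
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the section size, the density threshold, the printable-ASCII definition of an "effective character", and all function names are assumptions made for illustration. In particular, the final step swaps the paper's cluster-based distribution matching for a much simpler vote over address-minus-offset differences, and the absolute addresses are passed in directly rather than recovered from register load instructions.

```python
import string
from collections import Counter
from typing import Iterator, List, Optional, Tuple

# Assumption: "effective characters" are printable ASCII bytes (minus VT and FF).
PRINTABLE = set(string.printable.encode("ascii")) - {0x0B, 0x0C}


def char_density(block: bytes) -> float:
    """Fraction of bytes in the block that are printable ASCII."""
    if not block:
        return 0.0
    return sum(b in PRINTABLE for b in block) / len(block)


def text_sections(image: bytes, size: int = 256,
                  threshold: float = 0.8) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, block) for fixed-size sections whose density
    reaches the (hypothetical) threshold."""
    for off in range(0, len(image), size):
        block = image[off:off + size]
        if char_density(block) >= threshold:
            yield off, block


def string_offsets(image: bytes, min_len: int = 4) -> List[int]:
    """File offsets of NUL-terminated printable strings of at least min_len bytes."""
    offsets: List[int] = []
    start: Optional[int] = None
    for i, b in enumerate(image):
        if b in PRINTABLE:
            if start is None:
                start = i
        else:
            if start is not None and b == 0 and i - start >= min_len:
                offsets.append(start)
            start = None
    return offsets


def guess_base(addresses: List[int], str_offsets: List[int]) -> Optional[int]:
    """Simplified stand-in for distribution matching: every
    (absolute address, string offset) pair votes for base = address - offset,
    and the most frequent candidate wins."""
    votes = Counter(a - o for a in addresses for o in str_offsets if a >= o)
    return votes.most_common(1)[0][0] if votes else None
```

For example, given an image whose strings sit at file offsets 16 and 22, and absolute addresses `0x08000010` and `0x08000016` found elsewhere in the code, `guess_base` returns `0x08000000`, because that is the only candidate supported by more than one pair.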

Key words: Monolithic firmware, Base address identification, Data distribution, Character density, Absolute address cluster

CLC Number: 

  • TP391