Computer Science ›› 2025, Vol. 52 ›› Issue (6): 44-51.doi: 10.11896/jsjkx.240300166

• Computer Software • Previous Articles     Next Articles

Loop-invariant Code Motion Algorithm Based on Loop Cost Analysis

JIANG Jun, ZHAI Yanhe, ZENG Zhiheng, GU Yichao, HUANG Liangming   

  1. Wuxi Institute of Advanced Technology,Wuxi,Jiangsu 214122,China
  • Received:2024-03-26 Revised:2024-07-15 Online:2025-06-15 Published:2025-06-11
  • About author:JIANG Jun,born in 1980,master,senior engineer.His main research interests include compiler optimization,architecture-oriented performance analysis and optimization,etc.
    HUANG Liangming,born in 1988,Ph.D,engineer.His main research in-terests include compiler optimization,architecture-oriented performance ana-lysis and optimization,etc.
  • Supported by:
    Key Project of the Ministry of Science and Technology(GG20210701).

Abstract: Loop-invariant code motion(LICM) is a commonly used compilation optimization algorithm for loop structures in programs.By moving the invariant calculations in the loop body to outside the loop,the algorithm reduces the overhead of duplicate calculations,thus improving program execution speed.However,in LLVM compiler,the traditional LICM algorithm hoists all loop-invariants outside the loop body,which will lead to register overflow when the number of loop-invariant reaches a certain level.It will introduce additional memory access cost in the loop,resulting in a negative optimization effect on the loop.To address this issue,a loop cost analysis algorithm is introduced based on the traditional LLVM LICM algorithm.This algorithm evaluates the running cost of loop-invariant code inside the loop and the overflow cost that may be caused by moving the code outside the loop,and assesses the benefits of moving the code outside the loop.Only the loop-invariant code that produces positive benefits is moved outside the loop,effectively reducing the overhead of duplicate calculations in the loop while avoiding the risk of introducing additional costs.Theproposed new optimization algorithm achieves more than 17% performance improvement compared to the traditional LICM algorithm in typical use cases for the domestic SW831 processor platform under millions of loops.Comprehensive optimization effect evaluations are conducted using the SPEC CPU 2017 benchmark test suite(SPECspeed2017 Integer Suite),Perl interpreter DKbench benchmark test suite,and Python interpreter pyperformance benchmark test suite.The results show that compared with the traditional LICM algorithm,the proposed algorithm has 0.4%,0.63% and 1% performance improvement respectively.

Key words: LLVM compiler, Compilation optimization, Loop-invariant code motion, Register overflow, Loop cost analysis

CLC Number: 

  • TP314
[1]LATTNER C,ADVE V.LLVM:A compilation framework for lifelong program analysis & transformation[C]//International Symposium on Code Generation and Optimization.IEEE,2004:75-86.
[2]CHO J,CHO D,KIM Y.Study on LLVM application in ParallelComputing System[J].The Journal of the Convergence on Culture Technology,2019,5(1):395-399.
[3]WANG L,GAO K,ZHAO Y Q,et al.Deep Learning Compiler Load Balancing Optimization Method for Model Training[J].Journal of Frontiers of Computer Science and Technology,2024,18(1):111-126.
[4]ZHU X L.The implementation and optimization of deep learningcompiler based on GPDSP[D].Changsha:National University of Defense Technology,2021.
[5]XIE H C.An Auto Performance Predict Research for Scientific Program Based on LLVM[D].Harbin:Harbin Institute of Technology,2016.
[6]PEELER H,LI S S,SLOSS A N,et al.Optimizing LLVM pass sequences with Shackleton:a linear genetic programming framework[C]//Proceedings of the Genetic and Evolutionary Computation Conference Companion.2022:578-581.
[7]CHAI Y,CHEN M,LI J,et al.Implementation and optimization of data prefetching algorithm based on LLVM compilation system[C]//6th International Conference on Electronic Technology and Information Science.2021.
[8]HUANG Z F,SHANG J D.Optimization based on LLVM globalinstruction selection[C]//2021 International Conference on Computer Network Security and Software Engineering.2021.2021.
[9]AHO A V,SETHI R,ULLMAN J D.Compilers:principles,techniques,and tools[M].Reading:Addison-Wesley,1986.
[10]ZAKOWSKI Y,BECK C,YOON I,et al.Modular,composi-tional,and executable formal semantics for LLVM IR[J].Pro-ceedings of the ACM on Programming Languages,2021,5(ICFP):1-30.
[11]ANDREW W A.SSA is Functional Programming[J].Acm Sigplan Notices,1998,33(4):17-20.
[12]YAN Y,PAN Z,YU L,et al.Research on the influencing factors of LLVM IR optimization effect[C]//2023 IEEE 3rd International Conference on Information Technology,Big Data and Artificial Intelligence(ICIBA).IEEE,2023,3:756-763.
[13]SHANG J,XU K,HAN L,et al.Optimization of Access Ad-dress Calculation for LLVM[C]//2023 4th International Conference on Information Science,Parallel and Distributed Systems(ISPDS).IEEE,2023:458-464.
[14]RODRIGUEZ-CANCIO M,COMBEMALE B,BAUDRY B.Automatic microbenchmark generation to prevent dead code elimination and constant folding[C]//Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering.2016:132-143.
[15]GUO Z H,WU Y X,AN L F,et al.Study on function inlining optimization technologies based on LLVM[J].Computer Engineering and Applications,2017,53(3):41-46.
[16]WANG C X,HAN L,LIU H H.Optimization of loop unrolling based on instruction Cache and register pressure[J].Computer Engineering & Science,2022,44(12):2111-2119.
[17]HU Y X,ZHEGN Q L.Research on Deep Learning Based Loop Auto-schedule[J].Journal of Chinese Computer Systems,2024,45(7):1770-1777.
[18]MELLOR-CRUMMEY J,ADVE V.Simplifying control flow in compiler-generated parallel code[J].International Journal of Parallel Programming,1998,26(5):613-638.
[19]LI H X,LIU J.Methods of Data Flow Analysis[J].ComputerEngineering and Applications,2003,39(13):142-144.
[20]GONG L Q,SHEN L,ZHOU Q L,et al.Research and Implementation of Compile Lock Mechanism Based on LLVM[J].Computer Application and Software,2021,38(11):11-17.
[21]EBNER D,BRANDNER F,SCHOLZ B,et al.Generalized in-struction selection using SSA-graphs[C]//Proceedings of the 2008 ACM SIGPLAN-SIGBED Conference on Languages,Compilers,and Tools for Embedded Systems.2008:31-40.
[22]LOZANO R C,CARLSSON M,DREJHAMMAR F,et al.Constraint-based register allocation and instruction scheduling[C]//International Conference on Principles and Practice of Constraint Programming.Berlin,Heidelberg:Springer Berlin Heidelberg,2012:750-766.
[23]DE SOUZA XAVIER T C,OLIVEIRA G S,DE LIMA E D,et al.A detailed analysis of the llvm's register allocators[C]//2012 31st International Conference of the Chilean Computer Science Society.IEEE,2012:190-198.
[24]LIANG J L,HUA B J,LV Y S,et al.Loop Invariant Code Motion Algorithm for Deep Learning Operators[J].Journal of Frontiers of Computer Science and Technology,2023,17(1):127-139.
[25]SHI H K,WANG Z S,ZHANG S Z,et al.Performance Evaluation Benchmark of General-Purpose CPU:A Survey[J].Acta Electronica Sinica,2023,51(1):246-256.
[1] PEI Xue, WEI Shuai, SHAO Yangxue, YU Hong, GE Chenyang. Compilation Optimization and Implementation of High-order Cryptographic Operators on FPGA [J]. Computer Science, 2024, 51(11A): 231200184-11.
[2] FAN Lilin, QIAO Yihang, LI Junfei, CHAI Xuqing, CUI Rongpei, HAN Bingyu. CP2K Software Porting and Optimization Based on Domestic c86 Processor [J]. Computer Science, 2023, 50(6): 58-65.
[3] CHI Hao-yu, CHEN Chang-bo. Prediction of Loop Tiling Size Based on Neural Network [J]. Computer Science, 2020, 47(8): 62-70.
[4] . [J]. Computer Science, 2009, 36(3): 45-47.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!