计算机科学 ›› 2025, Vol. 52 ›› Issue (6): 44-51.doi: 10.11896/jsjkx.240300166
姜军, 翟彦河, 曾志恒, 顾轶超, 黄亮明
JIANG Jun, ZHAI Yanhe, ZENG Zhiheng, GU Yichao, HUANG Liangming
摘要: 循环不变量外提算法是一种针对程序中循环结构的常用编译优化算法,其通过将循环体中的不变计算移动到循环外部来减少重复计算的开销,从而提高程序运行的速度。但在LLVM编译器中,传统的循环不变量外提算法会将全部循环不变量外提到循环体外部,当循环不变量达到一定数量时会导致寄存器溢出,在循环内引入额外的访存代价,对循环产生负优化效果。针对上述问题,在传统LLVM循环不变量外提算法的基础上,引入了一种循环代价分析算法,通过计算循环不变量在循环体中的运行代价和外提操作可能带来的溢出代价,评估其外提可能带来的收益,只对产生正收益的循环不变量进行外提,在有效减少循环体内重复计算的同时,规避引入额外开销的风险。在国产申威831处理器平台,使用典型用例进行优化效果测评,在千万级循环下,相较于传统循环不变量优化算法,提出的新优化算法具有17%以上的性能提升;使用SPEC CPU2017基准测试集(SPECspeed 2017 Interger套件)、Perl解释器DKbench基准测试集、Python解释器pyperformance基准测试集进行综合优化效果测评,结果表明,相较于传统循环不变量优化算法,提出的新优化算法分别具有0.4%,0.63%和1%的性能提升。
中图分类号:
[1]LATTNER C,ADVE V.LLVM:A compilation framework for lifelong program analysis & transformation[C]//International Symposium on Code Generation and Optimization.IEEE,2004:75-86. [2]CHO J,CHO D,KIM Y.Study on LLVM application in ParallelComputing System[J].The Journal of the Convergence on Culture Technology,2019,5(1):395-399. [3]WANG L,GAO K,ZHAO Y Q,et al.Deep Learning Compiler Load Balancing Optimization Method for Model Training[J].Journal of Frontiers of Computer Science and Technology,2024,18(1):111-126. [4]ZHU X L.The implementation and optimization of deep learningcompiler based on GPDSP[D].Changsha:National University of Defense Technology,2021. [5]XIE H C.An Auto Performance Predict Research for Scientific Program Based on LLVM[D].Harbin:Harbin Institute of Technology,2016. [6]PEELER H,LI S S,SLOSS A N,et al.Optimizing LLVM pass sequences with Shackleton:a linear genetic programming framework[C]//Proceedings of the Genetic and Evolutionary Computation Conference Companion.2022:578-581. [7]CHAI Y,CHEN M,LI J,et al.Implementation and optimization of data prefetching algorithm based on LLVM compilation system[C]//6th International Conference on Electronic Technology and Information Science.2021. [8]HUANG Z F,SHANG J D.Optimization based on LLVM globalinstruction selection[C]//2021 International Conference on Computer Network Security and Software Engineering.2021.2021. [9]AHO A V,SETHI R,ULLMAN J D.Compilers:principles,techniques,and tools[M].Reading:Addison-Wesley,1986. [10]ZAKOWSKI Y,BECK C,YOON I,et al.Modular,composi-tional,and executable formal semantics for LLVM IR[J].Pro-ceedings of the ACM on Programming Languages,2021,5(ICFP):1-30. [11]ANDREW W A.SSA is Functional Programming[J].Acm Sigplan Notices,1998,33(4):17-20. [12]YAN Y,PAN Z,YU L,et al.Research on the influencing factors of LLVM IR optimization effect[C]//2023 IEEE 3rd International Conference on Information Technology,Big Data and Artificial Intelligence(ICIBA).IEEE,2023,3:756-763. [13]SHANG J,XU K,HAN L,et al.Optimization of Access Ad-dress Calculation for LLVM[C]//2023 4th International Conference on Information Science,Parallel and Distributed Systems(ISPDS).IEEE,2023:458-464. [14]RODRIGUEZ-CANCIO M,COMBEMALE B,BAUDRY B.Automatic microbenchmark generation to prevent dead code elimination and constant folding[C]//Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering.2016:132-143. [15]GUO Z H,WU Y X,AN L F,et al.Study on function inlining optimization technologies based on LLVM[J].Computer Engineering and Applications,2017,53(3):41-46. [16]WANG C X,HAN L,LIU H H.Optimization of loop unrolling based on instruction Cache and register pressure[J].Computer Engineering & Science,2022,44(12):2111-2119. [17]HU Y X,ZHEGN Q L.Research on Deep Learning Based Loop Auto-schedule[J].Journal of Chinese Computer Systems,2024,45(7):1770-1777. [18]MELLOR-CRUMMEY J,ADVE V.Simplifying control flow in compiler-generated parallel code[J].International Journal of Parallel Programming,1998,26(5):613-638. [19]LI H X,LIU J.Methods of Data Flow Analysis[J].ComputerEngineering and Applications,2003,39(13):142-144. [20]GONG L Q,SHEN L,ZHOU Q L,et al.Research and Implementation of Compile Lock Mechanism Based on LLVM[J].Computer Application and Software,2021,38(11):11-17. [21]EBNER D,BRANDNER F,SCHOLZ B,et al.Generalized in-struction selection using SSA-graphs[C]//Proceedings of the 2008 ACM SIGPLAN-SIGBED Conference on Languages,Compilers,and Tools for Embedded Systems.2008:31-40. [22]LOZANO R C,CARLSSON M,DREJHAMMAR F,et al.Constraint-based register allocation and instruction scheduling[C]//International Conference on Principles and Practice of Constraint Programming.Berlin,Heidelberg:Springer Berlin Heidelberg,2012:750-766. [23]DE SOUZA XAVIER T C,OLIVEIRA G S,DE LIMA E D,et al.A detailed analysis of the llvm's register allocators[C]//2012 31st International Conference of the Chilean Computer Science Society.IEEE,2012:190-198. [24]LIANG J L,HUA B J,LV Y S,et al.Loop Invariant Code Motion Algorithm for Deep Learning Operators[J].Journal of Frontiers of Computer Science and Technology,2023,17(1):127-139. [25]SHI H K,WANG Z S,ZHANG S Z,et al.Performance Evaluation Benchmark of General-Purpose CPU:A Survey[J].Acta Electronica Sinica,2023,51(1):246-256. |
|