计算机科学 ›› 2026, Vol. 53 ›› Issue (2): 117-123.doi: 10.11896/jsjkx.250300153

• 计算机体系结构 • 上一篇    下一篇

基于条件语句半不变性分析的循环分裂优化

韩林1, 邵晶晶2, 聂凯1, 李浩然1, 刘浩浩1, 陈梦尧2   

  1. 1 郑州大学国家超级计算郑州中心 郑州 450001
    2 郑州大学计算机与人工智能学院 郑州 450001
  • 收稿日期:2025-03-28 修回日期:2025-06-24 发布日期:2026-02-10
  • 通讯作者: 聂凯(ieknie@zzu.edu.cn)
  • 作者简介:(hanlin@zzu.edu.cn)
  • 基金资助:
    河南省重大科技专项(241100210100,221100210600);河南省科技攻关项目(242102211094);国家重点研发计划高性能计算专项(2023YFB3002505)

Loop Splitting Based on Conditional Statement Invariance Analysis

HAN Lin1, SHAO Jingjing2, NIE Kai1, LI Haoran1, LIU Haohao1, CHEN Mengyao2   

  1. 1 National Supercomputing Center in Zhengzhou,Zhengzhou University,Zhengzhou 450001,China
    2 College of Computer and Artificial Intelligence,Zhengzhou University,Zhengzhou 450001,China
  • Received:2025-03-28 Revised:2025-06-24 Online:2026-02-10
  • About author:HAN Lin,born in 1978,Ph.D,professor,Ph.D supervisor,is a senior member of CCF(No.16416M).His main research interests include high-performance computing,advanced compilation techniques,program optimization and indigenous innovation of key technologies.
    NIE Kai,born in 1997,Ph.D,lecturer,master’s supervisor.His main research interests include advanced compilation technologies and high-performance compu-ting.
  • Supported by:
    Major Science and Technology Special Projects in Henan Province(241100210100,221100210600),Key Projects of Science and Technology of Henan Province(242102211094) and National Key Research and Development Program of China(2023YFB3002505).

摘要: 循环分裂是一种重要的编译优化技术,可有效减少循环控制开销、提升指令流水线效率,并为后续优化创造机会。针对GCC编译器现有循环分裂策略适用范围受限的问题,提出了一种改进的循环分裂优化算法,该算法基于静态单赋值形式,根据循环内条件变量所在PHI节点的位置,将其区分为循环头PHI节点条件变量和合并PHI节点条件变量,针对这两种条件变量,算法分别进行半不变性分析,并以此选择分裂点,从而实现更通用的循环分裂优化。在申威GCC编译器中实现了该算法。在申威新一代处理器平台上的实验结果显示,相比原有的循环分裂算法,该算法使SPEC CPU 2006测试集中的470.lbm测试程序性能提升8.8%,SPEC CPU 2017测试集中的620.omnetpp_s测试程序性能提升4.3%。所提方法扩展了可优化循环结构的范围,提升了申威GCC编译器的优化效率,可助力国产申威平台的基础软件生态建设。

关键词: GCC编译器, 循环优化, 循环分裂, 静态单赋值, 控制依赖

Abstract: Loop splitting is a key compiler optimization technique that can reduce control overhead,improve instruction pipeline efficiency,and create opportunities for subsequent optimization.To address the limited applicability of the existing loop splitting strategy in GCC compiler,an improved algorithm is proposed based on static single assignment form.It categorizes condition variables by their positions in PHI nodes into loop-header PHI node condition variables and merge PHI node condition variables.For these two types of condition variables,the algorithm performs semi-invariance analysis separately and selects partitioning points accordingly,thereby achieving more general loop splitting optimization.The algorithm is implemented in the Sunway GCCcompi-ler.Experimental results on the new-generation Sunway processor platform show that,compared to the original algorithm,the proposed algorithm improves the performance of the 470.lbm benchmark in SPEC CPU 2006 test suite by 8.8%,and the 620.omnetpp_s benchmark in SPEC CPU 2017 test suite by 4.3%.This method expands the scope of optimizable loop structures,improves the optimization efficiency of the Sunway GCC compiler,and can provide assistance for the basic software ecosystem construction of the domestic Sunway platform.

Key words: GCC compiler, Loop optimization, Loop splitting, Static single assignment, Control dependence

中图分类号: 

  • TP311
[1]SENGUPTA A,BHADAURIA S.Intellectual property coreprotection of control data flow graphs using robust watermar-king during behavioural synthesis based on user resource constraint and loop unrolling factor[J].Electronics Letters,2016,52(6):439-441.
[2]ZIRAKSIMA M,LOTFI S,LZADKH H.Using an evolutionary approach based on shortest common supersequence problem for loop fusion[J].Soft Computing,2019,24(10):1-22.
[3]BIELECKI W,PALKOWSKI M.Space-time loop tiling for dynamic programming codes[J].Electronics,2021,10(18):2233.
[4]GAO W,XU J L,SUN H H,et al.Research on loop optimization techniques for SIMD vectorization[J].Journal of Information Engineering University,2016,17(4):496-503.
[5]TSIRAMUA S,MELADZE H,DAVITASHVILI T,et al.Stru-ctural Analysis of Multi-Core Processor and Reliability Evaluation Model[J].Mathematics,2025,13(3):515.
[6]YANG X C,JIANG J,MA X D,et al.Program Slicing Technique Based on GCC Key Variable Data Flow Analysis Algorithm[J].Computer Engineering and Applications,2017,53(24):40-54.
[7]WANG S D,YIN W J,DONG Y K,et al.Data Flow Analysis for Sequential Storage Structures[J].Journal of Software,2020,31(5):1276-1293.
[8]GAO W,LI Y Y,SUN H H,et al.An Improved Control Flow SIMD Vectorization Method[J].Journal of Software,2017,28(8):2046-2063.
[9]SWEET Y,CHAUDHARY P.An efficient branch predictor for improved accuracy of instruction level parallelism[J].The Journal of Supercomputing,2021,77(10):1-23.
[10]ROGERS S,SLYCORD J,RAHEJA R,et al.Scalable LLVM-Based Accelerator Modeling in gem5[J].Computer Architecture Letters,2019,18(1):18-21.
[11]JEONG S J.A loop splitting method for single loops with non-uniform dependences[J].Journal in computer virology:An independant journal dedicated to computer viral and antiviral technologies,2014,10(2):137-143.
[12]LIU J,WICKERSON J,CONSTANTINIDES G.Loop splitting for efficient pipelining in high-level synthesis[C]//IEEE 24th Annual International Symposium on Field-Programmable Custom Computing Machines(FCCM).IEEE,2016:72-79.
[13]XU R F,ZHAO Z Y,SHENG W G,et al.An Imperfect LoopMapping Method for Coarse-Grained Reconfigurable Architec-tures[J].Microelectronics & Computer,2018,35(7):50-57.
[14]YANG X.Advanced Loop Optimization Based on Tree-SSAFramework[J].Computer Knowledge and Technology,2009,5(24):7035-7037.
[15]DONG Y S,LI C J,XU Y.Implementation and Evaluation of Loop Array Prefetching Optimization in GCC Compiler[J].Computer Engineering and Applications,2016,52(6):19-25.
[16]BERG D V J,HAGA H.Matching Source Code Using Abstract Syntax Trees in Version Control Systems[J].Journal of Software Engineering and Applications,2018,11(6):318-340.
[17]COLOMBET Q,BRANDNER F,DARTE A.Studying Optimal Spilling in the Light of SSA[J].ACM Transactions on Architecture and Code Optimization(TACO),2015,11(4):1-26.
[18]CHEN M Y.Research on Loop Distribution Techniques forSunway GCC Compilation System[D].Zhengzhou:Zhengzhou University,2021.
[19]HAN L,XU J L,LI Y Y,et al.Loop Distribution and Fusion Optimization for Partial Vectorization[J].Computer Science,2017,44(2):70-81.
[20]GAO W,HAN L,ZHAO R C,et al.Vector Parallelism-Guided Loop SIMD Vectorization Method[J].Journal of Software,2017,28(4):925-939.
[21]LIM I H.An approach to comparing control clow graphs based on basic block matching[J].Indian Journal of Computer Science and Engineering,2020,11(3):289-296.
[22]CYTRON R,FERRANTE J,ROSEN B K,et al.Efficientlycomputing static single assignment form and the control dependence graph[J].ACM Transactions on Programming Languages and Systems,1991,13(4):451-490.
[23]XIE X F,CHEN B H,ZOU L,et al.Automatic Loop Summarization via Path Dependency Analysis[J].IEEE Transactions on Software Engineering,2019,45(6):537-557.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!