Research on Parallel Scheduling Strategy Optimization Technology Based on Sunway Compiler

doi:10.11896/jsjkx.241200072

Computer Science, 2025, Vol. 52, Issue (9): 137-143. doi: 10.11896/jsjkx.241200072

• High Performance Computing •

Research on Parallel Scheduling Strategy Optimization Technology Based on Sunway Compiler

XU Jinlong^1,3, WANG Gengwu^1,2, HAN Lin^1, NIE Kai^1, LI Haoran^1, CHEN Mengyao^1,2, LIU Haohao^1,2

1 National Supercomputing Center in Zhengzhou, Zhengzhou University, Zhengzhou 450001, China
2 School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
3 Information Engineering University, Zhengzhou 450001, China

Received: 2024-12-10 Revised: 2025-05-15 Online: 2025-09-15 Published: 2025-09-11
About author: XU Jinlong, born in 1985, Ph.D., lecturer, master's supervisor. His main research interests include high-performance computing and parallel compilation.
NIE Kai, born in 1987, Ph.D., lecturer, postgraduate supervisor. His main research interests include advanced compilation techniques, high-performance computing, etc.
Supported by:
2024 Henan Provincial Major Science and Technology Project (241100210100), 2024 Henan Provincial Science and Technology Research Project (242102211094), 2023 National Key R&D Program for High-Performance Computing (2023YFB3002505) and 2022 Henan Provincial Major Science and Technology Project (221100210600).

Abstract

Abstract: Scheduling strategies are an important part of compiler parallelization because they determine how evenly work is balanced across the cores of a multi-core processor. However, the Sunway GCC compiler defaults to static scheduling, which partitions loop iterations in advance. For irregular loop structures this causes load imbalance, which degrades the performance of parallel programs on the Sunway platform. To address this problem, the proposed method improves the existing scheduling strategies of Sunway GCC by incorporating the trapezoid self-scheduling strategy, which trades off scheduling overhead against load balance. The strategy was tested on the SW3231 processor with 844 parallel test cases from the GCC compiler test suite, and its performance was evaluated on the SPEC OMP 2012 benchmark and on four typical loop types. Compared with the three standard scheduling strategies in Sunway GCC, it achieves performance improvements of up to 1.10 times and 4.54 times, respectively. The method enhances thread-level parallelism in scientific computing programs and provides a useful reference for parallel compilation on the Sunway processor platform.
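
To make the contrast between static scheduling and trapezoid self-scheduling concrete, the following is a minimal sketch in Python. It is not the Sunway GCC implementation. The chunk sizes follow the commonly cited trapezoid self-scheduling defaults (first chunk of about N/(2P) iterations, last chunk of 1, sizes decreasing linearly). The function names, the triangular cost model (iteration i costs i + 1 units), and the greedy "next free thread takes the next chunk" simulation are all assumptions made for illustration, and scheduling overhead is ignored.

```python
import heapq
import math


def static_chunks(n, p):
    """Split n iterations into p near-equal contiguous blocks (static schedule)."""
    base, extra = divmod(n, p)
    return [base + (1 if i < extra else 0) for i in range(p)]


def tss_chunks(n, p, first=None, last=1):
    """Chunk sizes for trapezoid self-scheduling: linearly decreasing sizes."""
    if first is None:
        first = max(last, math.ceil(n / (2 * p)))
    steps = math.ceil(2 * n / (first + last))  # planned number of chunks
    delta = (first - last) // (steps - 1) if steps > 1 else 0
    chunks, size, remaining = [], first, n
    while remaining > 0:
        take = min(max(size, last), remaining)  # never exceed what is left
        chunks.append(take)
        remaining -= take
        size -= delta
    return chunks


def cost(start, size):
    """Hypothetical irregular loop: iteration i costs i + 1 work units."""
    return sum(i + 1 for i in range(start, start + size))


def makespan_static(n, p):
    """Finish time when each thread runs exactly one pre-assigned block."""
    start, worst = 0, 0
    for size in static_chunks(n, p):
        worst = max(worst, cost(start, size))
        start += size
    return worst


def makespan_self_scheduled(chunks, p):
    """Finish time when the earliest-free thread always grabs the next chunk."""
    free_at = [0] * p
    heapq.heapify(free_at)
    start = 0
    for size in chunks:
        t = heapq.heappop(free_at)
        heapq.heappush(free_at, t + cost(start, size))
        start += size
    return max(free_at)


if __name__ == "__main__":
    n, p = 1000, 4
    chunks = tss_chunks(n, p)
    print("TSS chunk sizes:", chunks)
    print("static makespan:", makespan_static(n, p))
    print("TSS makespan:   ", makespan_self_scheduled(chunks, p))
```

Under this cost model, the static schedule gives the most expensive block of iterations to a single thread, while the decreasing chunks let threads that finish early pick up the remaining work in smaller pieces. A real runtime would also pay a synchronization cost per chunk, which is the overhead side of the trade-off the abstract mentions.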

Key words: OpenMP scheduling strategy, Load balancing, Trapezoid self-scheduling, Scheduling overhead, Sunway GCC

CLC Number: TP314

XU Jinlong, WANG Gengwu, HAN Lin, NIE Kai, LI Haoran, CHEN Mengyao, LIU Haohao. Research on Parallel Scheduling Strategy Optimization Technology Based on Sunway Compiler [J]. Computer Science, 2025, 52(9): 137-143.
