Computer Science, 2025, Vol. 52, Issue (5): 76-82. doi: 10.11896/jsjkx.231200174

• High Performance Computing •

Research on Function Vectorization Technology Based on Directive Statements

LIU Lili1, SHAN Zheng1, LI Yingying1, WU Wenhao2, LIU Wenbo1   

  1 PLA Strategic Support Force Information Engineering University, Zhengzhou 450001, China
  2 National Research Center of Parallel Computer Engineering Technology, Wuxi, Jiangsu 100190, China
  • Received: 2023-12-25 Revised: 2024-06-29 Online: 2025-05-15 Published: 2025-05-12
  • About author: LIU Lili, born in 1999, postgraduate. Her main research interests include automatic vectorization in compilers.
    LI Yingying, born in 1984, Ph.D, associate professor, master supervisor. Her main research interests include high performance computing and advanced compilation techniques.
  • Supported by:
    2024 Laboratory for Advanced Computing and Intelligence Engineering(ACIE) Project.

Abstract: With the continuous development of processor technology, SIMD (Single Instruction Multiple Data) vectorization has been widely applied in various fields. However, previous research has mainly focused on loops and basic blocks, whereas whole-function vectorization can better exploit SIMD instructions and thereby improve application performance. This paper proposes a function vectorization method based on directive statements. First, a relatively simple directive statement is added to a loop that contains function calls, so that the call instructions in the loop can be vectorized. Second, the called function is vectorized by whole-function vectorization, which generates a vectorized version of the entire function instead of inlining it. Finally, the function call instructions in the loop are processed to generate vectorized function call instructions. Ten benchmarks selected from the ISPC benchmark tests and the SIMD library benchmark tests are used to evaluate the method, and the experimental results show an average speedup of 6.949 times over the scalar versions.
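
The abstract does not give the directive syntax or the compiler internals, so the following is only a conceptual sketch of the three steps, written as a pure-Python emulation and not as the authors' implementation. All names (`VLEN`, `scalar_f`, `vector_f`, `scalar_loop`, `vectorized_loop`) are hypothetical, vector lanes are modeled as Python lists, and the masking strategy shown (compute both branch sides, then select per lane) is one common way to handle divergent control flow, assumed here for illustration.

```python
# Illustrative emulation only: lanes are Python lists, not real SIMD registers.
VLEN = 4  # assumed vector width (number of lanes); hypothetical value


def scalar_f(x):
    """Scalar callee with divergent control flow."""
    if x < 0.0:
        return -x * 2.0
    return x + 1.0


def vector_f(xs, active):
    """Whole-function vectorized counterpart of scalar_f.

    One call processes VLEN lanes. The branch is linearized: both sides
    are computed for every lane and a per-lane mask selects the result.
    Lanes that are not active yield 0.0 and are discarded by the caller.
    """
    mask = [x < 0.0 for x in xs]
    then_vals = [-x * 2.0 for x in xs]
    else_vals = [x + 1.0 for x in xs]
    blended = [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]
    return [b if a else 0.0 for a, b in zip(active, blended)]


def scalar_loop(data):
    """Original loop: one scalar call per element."""
    return [scalar_f(x) for x in data]


def vectorized_loop(data):
    """Loop after the transformation: one vector call per VLEN elements.

    In the paper this loop would be marked with a directive; here the
    rewrite is done by hand. The last chunk is padded and masked off.
    """
    out = []
    for start in range(0, len(data), VLEN):
        chunk = list(data[start:start + VLEN])
        n = len(chunk)
        lanes = chunk + [0.0] * (VLEN - n)
        active = [True] * n + [False] * (VLEN - n)
        out.extend(vector_f(lanes, active)[:n])
    return out
```

Calling `vectorized_loop(data)` should give the same values as `scalar_loop(data)` for any list of floats, while issuing one call to the vector version per `VLEN` elements instead of one scalar call per element. The callee is kept as a separate function in both forms, which mirrors the abstract's point that the vectorized function is generated instead of being inlined into the loop.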

Key words: Function vectorization, SIMD, Automatic vectorization

CLC Number: TP312
[1] MO Shangfeng, ZHOU Zhenfen, HU Yonghua, XU Minmin, MAO Chunxian, YUAN Yudi. Transplantation and Optimization of Row-vector-matrix Multiplication in Complex Domain Based on FT-M7002 [J]. Computer Science, 2023, 50(11A): 220900277-6.
[2] YAO Jian-yu, ZHANG Yi-wei, ZHANG Guang-ting, JIA Hai-peng. High Performance Implementation and Optimization of Trigonometric Functions Based on SIMD [J]. Computer Science, 2021, 48(12): 29-35.
[3] LI Shuang, ZHAO Rong-cai, WANG Lei. Implementation and Optimization of Sunway1621 General Matrix Multiplication Algorithm [J]. Computer Science, 2021, 48(11A): 699-704.
[4] GONG Tong-yan, ZHANG Guang-ting, JIA Hai-peng, YUAN Liang. High-performance Implementation Method for Even Basis of Cooley-Tukey FFT [J]. Computer Science, 2020, 47(1): 31-39.
[5] ZHOU Bei, HUANG Yong-zhong, XU Jin-chen, GUO Shao-zhong. Study on SIMD Method of Vector Math Library [J]. Computer Science, 2019, 46(1): 320-324.
[6] JIN Xing-tong, LI Peng, WANG Gang, LIU Xiao-guang and LI Zhong-wei. Optimizing Small XOR-based Non-systematic Erasure Codes [J]. Computer Science, 2017, 44(6): 36-42.
[7] HAO Xin and GUO Shao-zhong. Optimization of 3D Finite Difference Algorithm on Intel MIC [J]. Computer Science, 2017, 44(5): 26-32.
[8] CHEN Yong and XU Chao. Symbolic Execution and Human-Machine Interaction Based Auto Vectorization Method [J]. Computer Science, 2016, 43(Z6): 461-466.
[9] YU Hai-ning, HAN Lin and LI Peng-yuan. Structure Optimization for Automatic Vectorization [J]. Computer Science, 2016, 43(2): 210-215.
[10] XU Jin-long, ZHAO Rong-cai, ZHAO Bo. Research on Non-full Length Usage of SIMD Vector Instruction [J]. Computer Science, 2015, 42(7): 229-233.
[11] SUN Hui-hui, ZHAO Rong-cai, GAO Wei and LI Yan-bing. Control Flow Vectorization Based on Conditions Classification [J]. Computer Science, 2015, 42(11): 240-247.
[12] XU Ying, LI Chun-jiang, DONG Yu-shan and ZHOU Si-qi. Implementation of Auto-vectorization Based on Directives in GCC [J]. Computer Science, 2014, 41(Z11): 364-367.
[13] LIU Peng, ZHAO Rong-cai, ZHAO Bo and GAO Wei. Unified Vectorization Framework for SIMD Extensions [J]. Computer Science, 2014, 41(9): 28-31.
[14] HOU Yong-sheng, ZHAO Rong-cai, HUANG Lei and HAN Lin. Research on SIMD-oriented Loop Optimizations [J]. Computer Science, 2014, 41(5): 27-32.
[15] ZHAO Bo, ZHAO Rong-cai, LI Yan-bing and GAO Wei. SLP Exploitation Method for Type Conversion Statements [J]. Computer Science, 2014, 41(11): 16-21.