Computer Science ›› 2024, Vol. 51 ›› Issue (4): 78-85. doi: 10.11896/jsjkx.230200024

• High Performance Computing •

Auto-vectorization Cost Model Based on Instruction MKS

WANG Zhen¹, NIE Kai², HAN Lin²

  1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
  2. National Supercomputing Center in Zhengzhou, Zhengzhou University, Zhengzhou 450000, China
  • Received: 2023-02-04  Revised: 2023-06-13  Online: 2024-04-15  Published: 2024-04-10
  • Supported by:
    Major Science and Technology Special Projects in Henan Province for 2022 (221100210600) and 22 Qiushi Research Initiation (Natural Science) (32213247).

Abstract: The auto-vectorization cost model is an important component of a compiler's auto-vectorization optimization. Its role is to estimate whether code will run faster after the vectorization transformation is applied. When the cost model is inaccurate, the compiler may apply vectorization transformations that have negative benefit, reducing the program's execution efficiency. To address the inaccuracy of the GCC compiler's default cost model, an auto-vectorization cost model based on instruction MKS is proposed, targeting the Intel Xeon Silver 4214R CPU. The model takes into account the machine mode, operation type, and operation intensity of instructions, and uses a gradient descent algorithm to automatically search for approximate costs of the different instruction types. Single-threaded tests are carried out on SPEC2006 and SPEC2017. Experimental results show that the model reduces the error of benefit estimation. Compared with the vectorized programs generated under the default cost model, GCC with the MKS cost model achieves a maximum speedup of 4.72% on the SPEC2006 benchmark and 7.08% on the SPEC2017 benchmark.
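The idea of searching for per-instruction costs with gradient descent can be illustrated with a toy sketch. It is not the paper's model: the linear cost form, the instruction classes, the instruction counts, and the "measured" costs below are all assumptions made up for illustration, and the function names are hypothetical.

```python
# Toy sketch: fit per-instruction-class costs by batch gradient descent.
# Assumption: loop cost is modelled as a linear sum of (class cost * count).
# All classes, counts, and target costs here are invented for illustration.

# Hypothetical instruction classes, keyed by
# (machine mode, operation type, operation intensity).
CLASSES = [
    ("V8SF", "load", "light"),
    ("V8SF", "mult", "heavy"),
    ("V8SF", "store", "light"),
]


def predict(weights, counts):
    """Linear cost estimate: sum of per-class cost times instruction count."""
    return sum(w * c for w, c in zip(weights, counts))


def fit_costs(samples, lr=0.01, epochs=5000):
    """Minimise mean squared error between predicted and measured cost."""
    n_classes = len(samples[0][0])
    weights = [1.0] * n_classes  # arbitrary starting guess
    for _ in range(epochs):
        grads = [0.0] * n_classes
        for counts, measured in samples:
            err = predict(weights, counts) - measured
            for k, c in enumerate(counts):
                # d/dw_k of mean(err^2) = (2 / n) * sum(err * count_k)
                grads[k] += 2.0 * err * c / len(samples)
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights


# Synthetic "measurements" generated from made-up true costs, so the fit
# can be checked against a known answer.
TRUE_COSTS = [4.0, 5.0, 3.0]
COUNTS = [[2, 1, 1], [1, 3, 1], [4, 0, 2], [1, 1, 3], [3, 2, 0]]
SAMPLES = [(c, predict(TRUE_COSTS, c)) for c in COUNTS]

fitted = fit_costs(SAMPLES)
for cls, cost in zip(CLASSES, fitted):
    print(cls, round(cost, 3))
```

Because the synthetic samples are noise-free and generated from the same linear form, the fitted costs should recover the made-up true costs. Real measurements would be noisy, and a practical model would presumably need many more samples and instruction classes than this sketch uses.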

Key words: GCC compiler, Auto-vectorization, Cost model, Profit evaluation, Gradient descent

CLC Number: 

  • TP314