Computer Science ›› 2025, Vol. 52 ›› Issue (9): 186-194. doi: 10.11896/jsjkx.241100130
• High Performance Computing •
HAN Lin1,2, DING Yongqiang1, CUI Pingfei1, LIU Haohao2, LI Haoran2, CHEN Mengyao2