Computer Science, 2018, Vol. 45, Issue (9): 220-223. doi: 10.11896/j.issn.1002-137X.2018.09.036

• Software & Database Technology •

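Vectorization Methods for Indirect Array Index

YAO Jin-yang, ZHAO Rong-cai, WANG Qi, LI Ying-ying

  1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China
  • Received: 2017-11-07 Online: 2018-09-20 Published: 2018-10-10
  • About the authors: YAO Jin-yang (born 1992), male, master's student; research interests: high-performance computing and advanced compilation; E-mail: yaojy1024@126.com. ZHAO Rong-cai (born 1957), male, Ph.D., professor, CCF senior member; research interests: high-performance computing, advanced compilation, and decompilation. WANG Qi (born 1992), male, master's student; research interests: high-performance computing and advanced compilation. LI Ying-ying (born 1984), female, Ph.D. candidate, lecturer; research interests: high-performance computing and advanced compilation.
  • Funding:
    This work was supported by the "High Performance Computing" key special project of the National Key Research and Development Program of China (2016YFB0200503).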

Abstract: Existing compilers cannot vectorize indirect array indexing efficiently, so programs that contain this memory access pattern cannot take advantage of SIMD extensions. This is an active topic in program vectorization research. To use SIMD extensions efficiently and fully exploit the vectorization potential of programs, this paper proposes a new vectorization method for indirect array indexing, together with a performance benefit analysis method that evaluates the benefit for each kind of indirect array index. Experimental results show that the proposed vectorization method significantly improves program execution efficiency.

Key words: Cost-benefit analysis, Indirect array index, Temporary array, Vectorization
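
To make the access pattern concrete, the following C sketch places a loop with an indirect array index, `a[idx[i]]`, next to a variant that first packs the indirectly addressed elements into a contiguous temporary array. The abstract and key words only name the temporary array; the function names, the `BLOCK` size, the `a_len` bounds check, and the two-step loop structure are assumptions made for illustration and are not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar baseline: the load a[idx[i]] is an indirect array index,
   so consecutive iterations read non-contiguous addresses. */
void scale_indirect_scalar(float *c, const float *a, const float *b,
                           const int *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = a[idx[i]] * b[i];
}

#define BLOCK 256u /* hypothetical block size, chosen for illustration */

/* Assumed temporary-array variant: c must not overlap a, b, or idx. */
void scale_indirect_tmp(float *c, const float *a, size_t a_len,
                        const float *b, const int *idx, size_t n)
{
    float tmp[BLOCK];

    for (size_t base = 0; base < n; base += BLOCK) {
        size_t len = (n - base < BLOCK) ? n - base : BLOCK;

        /* Step 1: pack the indirectly indexed elements into tmp
           (this loop stays scalar unless the target has gather loads). */
        for (size_t j = 0; j < len; j++) {
            assert(idx[base + j] >= 0 && (size_t)idx[base + j] < a_len);
            tmp[j] = a[idx[base + j]];
        }

        /* Step 2: every access is now unit-stride, so an
           auto-vectorizer can map this loop onto SIMD instructions. */
        for (size_t j = 0; j < len; j++)
            c[base + j] = tmp[j] * b[base + j];
    }
}
```

Whether the second form pays off depends on the cost of the packing loop relative to the vectorized arithmetic, which is the kind of trade-off a benefit analysis has to weigh for each indirect access.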

CLC number: TP311