Computer Science (计算机科学) ›› 2017, Vol. 44 ›› Issue (2): 70-74. doi: 10.11896/j.issn.1002-137X.2017.02.008
韩林,徐金龙,李颖颖,王阳
HAN Lin, XU Jin-long, LI Ying-ying and WANG Yang
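Abstract: Many loops contain a few statements that cannot be vectorized alongside many statements that can. Loop distribution can usually separate these statements into different loops, which makes partial vectorization of the original loop possible. Current mainstream optimizing compilers support only simple, aggressive loop distribution, so the vectorized loops carry excessive loop overhead and make poor reuse of registers and cache. To address these problems, this paper proposes a loop distribution and fusion method oriented toward partial vectorization. First, two key problems of general loop distribution are analyzed: partitioning the statement set and determining the execution order of the resulting loops. Second, a maximal-fusion-oriented method for ordering the nodes of the condensation graph is proposed to guide loop fusion, reducing loop overhead without affecting parallelism. Finally, the proposed method is validated experimentally. The results show that, for the test cases, the method generates correct vectorized code and significantly improves the execution efficiency of the vectorized programs.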
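To make the idea in the abstract concrete, the following is a minimal hand-written sketch and not the paper's algorithm or its test cases. It assumes a loop with one statement that carries a recurrence (S1) and two statements that are independent across iterations (S2, S3). The function names `loop_original`, `loop_distributed_max`, and `loop_distributed_fused` are hypothetical, and the arrays are assumed not to alias (hence `restrict`).

```c
#include <stddef.h>

/* Original loop. S1 has a loop-carried dependence (a[i] needs a[i-1]),
 * so the loop as a whole cannot be vectorized. S2 and S3 only depend on
 * values produced earlier in the same iteration. */
void loop_original(size_t n, float *restrict a, const float *restrict b,
                   float *restrict c, float *restrict d)
{
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + b[i];  /* S1: recurrence, not vectorizable */
        c[i] = a[i] * 2.0f;      /* S2: vectorizable, reads a[i]     */
        d[i] = c[i] + b[i];      /* S3: vectorizable, reads c[i]     */
    }
}

/* Aggressive distribution: one loop per statement. S2 and S3 can now be
 * vectorized, but there are three loop headers, and c[] is written in
 * one loop and read back from memory in the next. */
void loop_distributed_max(size_t n, float *restrict a, const float *restrict b,
                          float *restrict c, float *restrict d)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];  /* S1 stays scalar */
    for (size_t i = 1; i < n; i++)
        c[i] = a[i] * 2.0f;      /* S2 */
    for (size_t i = 1; i < n; i++)
        d[i] = c[i] + b[i];      /* S3 */
}

/* Distribution followed by fusion: the two vectorizable statements are
 * merged back into a single loop. The S1 -> S2 -> S3 order required by
 * the dependences is preserved, there is one loop fewer, and c[i] can
 * stay in a register between S2 and S3. */
void loop_distributed_fused(size_t n, float *restrict a, const float *restrict b,
                            float *restrict c, float *restrict d)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];  /* S1 stays scalar */
    for (size_t i = 1; i < n; i++) {
        c[i] = a[i] * 2.0f;      /* S2 */
        d[i] = c[i] + b[i];      /* S3 */
    }
}
```

All three variants compute the same values. They differ only in how many loops are executed and in how far apart the producer and consumer of `c[i]` are, which is the trade-off between loop overhead and data reuse that the abstract describes.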