Computer Science ›› 2015, Vol. 42 ›› Issue (12): 18-22.

Memory Access Optimization for Vector Program of SIMD Form

XU Jin-long, ZHAO Rong-cai and XU Xiao-yan   

  • Online:2018-11-14 Published:2018-11-14

Abstract: Vector programs are obtained in two ways: they are written by hand or generated automatically by a compiler. Because the abilities of both programmers and parallelizing compilers are limited, vector programs usually leave room for further optimization. Optimizing compilers focus mainly on transforming serial programs into vector form and rarely optimize the code further once the vector form has been generated. We propose a memory access optimization method for vector programs in SIMD form. The method first determines whether a program needs to be optimized. If it does, redundancy optimization and alignment optimization are applied to the vector program. Experimental results show that the proposed method significantly improves the running efficiency of the program and thus achieves its optimization goal.

Key words: Vectorization, Single instruction multiple data, Memory access redundancy, Alignment optimization
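
The abstract names two transformations, redundancy optimization and alignment optimization, without giving their details. The following is a minimal sketch in portable C of what such transformations can look like on a simple stencil; it is not the authors' algorithm. The 4-lane width, the `vec4` type, the helper names (`vload`, `vshift1`, and so on), and the load counter are all assumptions introduced here for illustration. The baseline loop issues one aligned and one overlapping unaligned vector load per iteration; the optimized loop issues only aligned loads, reuses the previously loaded vector, and builds the shifted operand with a register-only shuffle.

```c
#include <assert.h>
#include <stddef.h>

#define VLEN 4  /* lanes per vector; an assumed width for illustration */

/* Hypothetical stand-in for a SIMD register. */
typedef struct { float lane[VLEN]; } vec4;

static size_t load_count = 0;  /* number of vector loads issued so far */

/* Model of a vector load; every call counts as one memory access. */
static vec4 vload(const float *p) {
    vec4 v;
    for (int k = 0; k < VLEN; ++k) v.lane[k] = p[k];
    ++load_count;
    return v;
}

/* Model of a vector store. */
static void vstore(float *p, vec4 v) {
    for (int k = 0; k < VLEN; ++k) p[k] = v.lane[k];
}

/* Lane-wise addition. */
static vec4 vadd(vec4 x, vec4 y) {
    vec4 r;
    for (int k = 0; k < VLEN; ++k) r.lane[k] = x.lane[k] + y.lane[k];
    return r;
}

/* Register-only shuffle: lanes 1..VLEN-1 of lo followed by lane 0 of hi. */
static vec4 vshift1(vec4 lo, vec4 hi) {
    vec4 r;
    for (int k = 0; k < VLEN - 1; ++k) r.lane[k] = lo.lane[k + 1];
    r.lane[VLEN - 1] = hi.lane[0];
    return r;
}

/* Baseline vector form of b[i] = a[i] + a[i+1] for i in [0, n).
 * Each iteration loads a[i..i+3] and the overlapping a[i+1..i+4]. */
void stencil_naive(const float *a, float *b, size_t n) {
    assert(n % VLEN == 0);
    for (size_t i = 0; i < n; i += VLEN) {
        vec4 x = vload(a + i);      /* block-aligned load */
        vec4 y = vload(a + i + 1);  /* unaligned, mostly redundant load */
        vstore(b + i, vadd(x, y));
    }
}

/* Optimized form: only block-aligned loads, each block loaded once.
 * Requires a to hold at least n + VLEN readable elements. */
void stencil_optimized(const float *a, float *b, size_t n) {
    assert(n % VLEN == 0);
    if (n == 0) return;
    vec4 cur = vload(a);
    for (size_t i = 0; i < n; i += VLEN) {
        vec4 next = vload(a + i + VLEN);  /* the only load per iteration */
        vec4 y = vshift1(cur, next);      /* replaces the unaligned load */
        vstore(b + i, vadd(cur, y));
        cur = next;                       /* reuse instead of reloading */
    }
}
```

In this model the baseline issues 2·(n/4) vector loads and the optimized loop issues n/4 + 1, at the cost of reading one extra block past the last output element, so the input array must be padded accordingly.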

