Computer Science ›› 2014, Vol. 41 ›› Issue (5): 27-32.doi: 10.11896/j.issn.1002-137X.2014.05.006

Previous Articles     Next Articles

Research on SIMD-oriented Loop Optimizations

HOU Yong-sheng,ZHAO Rong-cai,HUANG Lei and HAN Lin   

  • Online:2018-11-14 Published:2018-11-14

Abstract: Accelerating program performance via SIMD vector is very common in modern processors.Based on SIMD data dependency constraints,we presented the design and implementation of the improved Tarjan algorithm with reverse dependency graph to increase recognition rate of SIMD parallelization.And in combination with traditional vectorization method,the SIMD-oriented loop optimization technology was proposed that avoids unnecessary overhead in data reorganization.Our experimental results demonstrate that the performance of generated SIMD using the technology code is increased by 12% compared with ICC.

Key words: SIMD,Dependence analysis,Loop optimization,Tarjan

[1] Larsen S,Amarasinghe S.Exploiting superword level parallelism with multimedia instruction sets[C]∥Conference on Programming Language Design and Implementation.Vancouver,BCCanada,June 2000:145-156
[2] Intel.IA-32Intel Architecture Optimization Reference Manual.http://www.intel.com/products/processor/manuals/index.htm,2005
[3] Franchetti F,Kral S,Lorenz J,et al.Efficientutilization of SIMD extensions[J].Proc.IEEE,2005,3(2):409-425
[4] Al-Shayka O,Chen S,Mattavelli M.Introduction to the special issue on multimedia implementation[J].IEEE Transactions on Circuits and System for Video Technology,2002,2(8):629-632
[5] Wu Peng,Eichenberger A E,Wang A.Efficient SIMD CodeGeneration for Runtime Alignment and Length Conversion[C]∥Proceedings of the International Symposium on Code Generation and Optimization.March 2005:153-164
[6] Wu Peng,Eichenberger A E,Wang A,et al.An integrated simdization framework using virtual vectors[C]∥Proceedings of the 19th Annual International Conference on Supercomputing.June 2005:169-178
[7] Kennedy K,Allen J R.Optimizing compilers for modern archi-tectures:a dependence-based approach[M].San Francisco CA,USA:Morgan Kaufmann Publishers Inc,2002
[8] Bacon D F,Graham S L,Sharp O J.Compiler Transformations for High-Performance Computing[J].ACM Computing Surveys,1994,26(4)
[9] Tarjan R E.Depth-first search and linear graph algorithms[J].SIAM Journal on Computing,1972,1(2):146-160
[10] Fournier J J A,Moore S W.A vector approach to cryptography implementation[M]∥Digital Rights Management.Technologyies,issues,challenges and systems.Springer Berlin Heidlberg,2005:277-297
[11] Franchetti F.A portable short vector version of FFTW [C]∥Proc.Fourth IMACS Symposium on Mathematical Modeling (MATHMOD 2003).2003,2:1539-1548
[12] Slingerland N T,Smith A J.Measuring instruction sets for ge-neral purpose microprocessors:A survey [R].CSD-00-1122.University of California at Berkeley Technical Report,2000
[13] Shin J,Hall M,Chame J.Superword-level parallelism in thepresence of control flow[C]∥International Symposium on Code Generation and Optimization.2005:165-175
[14] Feld D,Soddemann T,Jünger M,et al.Facilitate SIMD-Code-Generation in the Polyhedral Model by Hardware-aware Automatic Code-Transformation[C]∥Proceeding of the 3rd International Workshop on Polyhedral Compilation Techniques.2013:45-54
[15] Eichenberger A,Wu P,O’Brien K.Vectorization for simd architectures with alignment constraints[C]∥PLDI.2004
[16] Nuzman D,Rosen I,Zaks A.Auto-vectorization of interleaveddata for simd[C]∥PLDI.2006
[17] Kong M,Veras R,Stock K,et al.Polyhedral TransformationsMeet SIMD Code Generation[C]∥PLID.I2013
[18] Nuzman D,Zaks A.Outer-loop vectorization:revisited for short simd architectures[C]∥PACT.2008
[19] Chen C,Chame J,Hall M.Chill:A framework for composinghigh-level loop transformations[R].Technical Report 08-897.USC Computer Science Technical Report.2008
[20] Trifunovic K,Nuzman D,Cohen A,et al.Polyhedral-modelguided loop-nest auto-vectorization[C]∥PACT.Sept.2009
[21] Vasilache N,Meister B,Baskaran M,et al.Joint scheduling and layout optimization to enable multi-level vectorization[C]∥Proc.of IMPACT’12.Jan.2012

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!