Computer Science ›› 2013, Vol. 40 ›› Issue (10): 24-28.

Previous Articles     Next Articles

SLP Optimization Algorithm Using Across Basic Block Transformation and Loop Distribution

SUO Wei-yi,ZHAO Rong-cai,YAO Yuan and ZHANG Xiao-mei   

  • Online:2018-11-16 Published:2018-11-16

Abstract: The existing SLP algorithms cannot handle dependent ring and the reduction of the inner loop,and generate a large number of redundant packet disassembly and assignment statements in a basic block boundary,which leads to the lower quantization efficiency.In order to solve the problem,this paper proposed a SLP optimization algorithm using cross basic block transformation and loop distribution.Based on the control flow graph,according to the basic blocks of the array variable between Define-Use and across basic block data relation between across basic block,the algorithm makes the quantized transform,orderly uses across basic block transform and loop distribution,and then expands inner loop within a basic block sentence parallelism as far as possible,making SLP automatic vectorization compiler to generate the vectorization code which has more SIMD instruction.The experimental results show that the algorithm can hide more across basic block redundancy operation cost,at the same time generate better SIMD instructions across basic block data dependence,effectively improving the vectorization program speedup.

Key words: SLP,Cross basic block,Loop distribution,Data dependence,Control flow graph,Define-Use relationship

[1] Franchetti F,Kral S,Lorenz J,et al.Efficiem utilization of SIMD extensions[J].Proceedings ofthe IEEE,2005,93(2):409-425
[2] TMS320C6000CPU and Instruction Set Reference Guide(Rev.F)[M].TexasInstruments Inc.2000
[3] SC140DSP Core Reference Manual[R/OL].http:// cache.freescale.com/files/dsp/doc/ref_manual/MNSC140CORE.pdf,2012-05-20
[4] Fridman J,Greenfield Z.The Tiger SHARC DSP Architecture[J].IEEE Micro,2000,0(1):66-76
[5] Tanaka H,Ota Y,Matsumoto N,et al.A New CompilationTechnique for SIMD Code Generation Across Basic Block Boundaries[C]∥Design Automation Conference(ASP-DAC),201015th Asia and South Pacifi.Jan.2010:101-106
[6] Larsen S,Amarasinghe S.Exploiting superword level parallelism with multimedia instruction sets[C]∥Proc of the ACM SIGPLAN Conference on Programming Language Design and Implementation.June 2000:145-156
[7] Shin J,Hall M,Chame J.Superword-level Parallelism in thePresence of Control Flow[C]∥Proc.of the International Symposium on Code Generation and optimization.March 2005:165-175
[8] Nuzman D,Zaks A.Outer-loop vectorization:revisited for short simd architectures[C]∥Proceedings of the 17th international conference on parallel architectures and compilation techniques,PACT ’08.New York,NY,USA,ACM,2008:2-11
[9] Aho A V,Lam M S,Sethi R,et al.编译原理[M].赵建华,郑滔,戴新宇,译.北京:机械工业出版社,2009
[10] 陈火旺,刘春林,谭庆平,等.程序设计语言编译原理(第3版)[M].北京:国防工业出版社,2001
[11] Allen F E,Cocke J.A program data flow analysis procedure[J].Communications of ACM,1976,19(3)
[12] Open64[EB/OL].http://open64.sourceforge.net,2012-09-10

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!