Computer Science ›› 2015, Vol. 42 ›› Issue (11): 240-247. doi: 10.11896/j.issn.1002-137X.2015.11.049
孙回回,赵荣彩,高伟,李雁冰
SUN Hui-hui, ZHAO Rong-cai, GAO Wei and LI Yan-bing
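Abstract: Modern compilers rely increasingly on SIMD instructions to improve vectorization performance, but complex control flow severely hinders the discovery of SIMD vectorization opportunities. Existing control-flow vectorization methods are effective for single-level control flow but do not give satisfactory results for nested and other complex control flow. This paper therefore proposes a control-flow vectorization method based on condition classification. For control flow whose condition is loop-invariant, the method applies IF hoisting in level-order traversal. For control flow whose condition varies across loop iterations, it applies IF-conversion recursively, combined with statement matching and condition merging, and generates the corresponding SIMD instructions, thereby vectorizing nested control flow. Experimental results show that the method effectively eliminates nested control flow in loops, improves the ability to discover vectorization opportunities, and improves the performance of the test programs.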
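The two cases named in the abstract can be illustrated with a small hand-written sketch in C. It is not the authors' algorithm or compiler output. The function names (`scalar_ref`, `transformed`), the loop body, and the condition `flag` are hypothetical, chosen only to show a loop-invariant condition being hoisted out of the loop and nested loop-variant conditions being merged into masks and turned into selects, which is the branch-free form a SIMD code generator can map to compare and blend instructions.

```c
#include <stddef.h>

/* Original loop with nested control flow.
 * `flag` is loop-invariant; a[i] > 0 and b[i] > 0 change per iteration. */
void scalar_ref(const int *a, const int *b, int *out, size_t n, int flag) {
    for (size_t i = 0; i < n; i++) {
        if (flag) {
            if (a[i] > 0) {
                if (b[i] > 0)
                    out[i] = a[i] + b[i];
                else
                    out[i] = a[i] - b[i];
            }
        } else {
            out[i] = 0;
        }
    }
}

/* Hand-transformed version (illustrative assumption, not the paper's output).
 * Step 1: the loop-invariant IF on `flag` is hoisted, giving two loops.
 * Step 2: the nested loop-variant IFs are if-converted: the nested
 *         conditions are merged into masks and each store becomes a select.
 * Both arithmetic results are computed on every iteration, so this sketch
 * assumes a[i] + b[i] and a[i] - b[i] never overflow. */
void transformed(const int *a, const int *b, int *out, size_t n, int flag) {
    if (flag) {
        for (size_t i = 0; i < n; i++) {
            int m_outer = a[i] > 0;
            int m_then  = m_outer & (b[i] > 0);   /* merged: a>0 && b>0   */
            int m_else  = m_outer & !(b[i] > 0);  /* merged: a>0 && !(b>0) */
            int sum  = a[i] + b[i];
            int diff = a[i] - b[i];
            int r = out[i];                       /* old value if no mask is set */
            r = m_then ? sum  : r;                /* select, no branch needed */
            r = m_else ? diff : r;
            out[i] = r;
        }
    } else {
        for (size_t i = 0; i < n; i++)
            out[i] = 0;
    }
}
```

After the transformation each loop body is straight-line code, so a vectorizer can process several iterations per instruction. Calling both functions on the same inputs, for `flag` equal to 0 and to 1, should leave identical contents in `out`.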