自动并行化中不规则循环的代码生成

Abstract

Abstract: Many large-scale scientific applications contain irregular loops．But the prior work of automatic parallelization on distributed memory is hard to generate parallel code for irregular loops at compile-time．We proposed an approach for effective code generation of a common class of irregular loops,and it’s able to transform the serial code of irregular loops to equal parallel computation and communication code at compile-time．The approach searches the local definition set of the loops on each processor by computation decomposition and access expression of array references,and satisfies the producer-consumer relation of irregular array references by partial communication redundancy．The experimental results show that our approach is valid and improves the speedup of test applications.

Key words: Automatic parallelization,Computation decomposition,Irregular loops,Partial redundancy

DING Rui,ZHAO Rong-cai,XU Jin-long and FU Li-guo. Code Generation for Automatic Parallelization of Irregular Loops[J].Computer Science, 2013, 40(12): 9-14.

References

[1] Ancourt C,Irigoin F．Scanning polyhedra with do loops[C]∥Proceedings of the third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP’91),1991．NY,USA:ACM New York,1991:39-50
[2] Amarasinghe S P,Lam M S．Communication optimization andcode generation for distributed memory machines[C]∥Procee-dings of The ACM SIGPLAN 1993Conference on Programming Language Design and Implementation (PLDI’93),1993．NY,USA:ACM New York,1993:126-138
[3] Ferner C S．The Paraguin Compiler-Message-Passing Code Generation Using SUIF[C]∥Proceedings IEEE SoutheastCon,2002．Washington DC:IEEE Computer Society,2002:1-6
[4] Ferner C S．Revisiting Communication Code Generation Algorithms for Message-passing Systems[J]．International Journal of Parallel,Emergent and Distributed Systems,2006,21(5):323-344
[5] Bondhugula U,Hartono A,Ramanujam J,et al．A practical automatic polyhedral parallelizer and locality optimizer[C]∥Proceedings of The ACM SIGPLAN 2008Conference on Programming Language Design and Implementation (PLDI’08),2008．NY,USA:ACM New York,2008:101-113
[6] Bondhugula U．Automatic distributed-memory parallelizationand code generation using the polyhedral framework[R]．IISc-CSA-TR-2011-3．Bangalore:Indian Institute of Science,2011
[7] Ponnusamy R,Saltz J,Choudhary A．Runtime-compilation techniques for data partitioning and communication schedule reuse[C]∥Proceedings of the 1993ACM/IEEE Conference on Supercomputing,1993．NY,USA:ACM New York,1993:361370
[8] Mukherjee S S,Sharma S D,Hill M D,et al．Efficient support for irregular applications on distributedmemory machines[C]∥Proceedings of the Fifth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 1995．NY,USA:ACM New York,1995:68-79
[9] Guo Min-yi,Li Li,Chang Weng-long．Efficient loop partitioningfor parallel codes of irregular scientific computations[J]．IEICE Transactions on Information and Systems,2003,86(9):1825-1834
[10] Ravishankar M,Eisenlohr J,Pouchet LN,et al．Code generation for parallel execution of a class of irregular loops on distributed memory systems[C]∥ The International Conference for High Performance Computing,Networking,Storage,and Analysis (SC12),2012．Los Alamitos,CA,USA:IEEE Computer Society Press,2012
[11] Ravishankar M,Eisenlohr J,Pouchet LN,et al．Code generation for parallel execution of a class of irregular loops on distributed memory systems[R]．OSU-CISRC-5/12-TR10．The Ohio State University,2012
[12] Strout M M,George G,Olschanowsky C．Set and relation ma-nipulation for the sparse polyhedral framework[C]∥ Procee-dings of the 25^th International Workshop on Languages and Compilers for Parallel Computing (LCPC2012),2012．Springer-burg LNCS series,2013
[13] LaMielle A,Strout M．Enabling code generation with sparse polyhedral framework[R]．CS-10-102Colorado State University,2010
[14] Guo Min-yi,Pan Yi,Liu Zhen．Symbolic Communication SetGeneration for Irregular Parallel Applications[J]．The Journal of Supercomputing,2003,25(3):199-214
[15] 胡长军,李静,王珏,等．一类非规则并行应用问题的通信集生成算法[J]．计算机学报,2008,31(1):120-126
[16] Lee S,Min S J,Eigenmann R．OpenMP to GPGPU:a compiler framework for automatic translation and optimization[C]∥Proceedings of the 14^th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming(PPOPP’99),2009．NY,USA:ACM New York,2009:101-110
[17] Basumallik A,Eigenmann R．Optimizing irregular shared-memory applications for distributed-memory systems[C]∥Procee-dings of the 11th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP’06),2006．NY,USA:ACM New York,2006:119-128
[18] Campanoni S,Jones T,Holloway G,et al．HELIX:AutomaticParallelization of Irregular Programs for Chip Multiprocessing[C]∥Proceedings of the 10^th International Symposium on Code Generation and Optimization (CGO’12),2012．NY,USA:ACM New York,2012:84-93
[19] Kim H,Johnson N P,Lee J W,et al．Automatic Speculative DOALL for Clusters[C]∥Proceedings of the 10th International Symposium on Code Generation and Optimization (CGO ’12),2012．NY,USA:ACM New York,2012:94-103
[20] Zhuang X,Eichenberger A E,Luo Y,et al．Exploiting parallelism with dependence-aware scheduling[C]∥Proceedings of the 2009International Conference on Parallel Architectures and Compilation Techniques (PACT’ 09),2009．Washington DC,USA:IEEE Computer Society,2009:193-202
[21] Lin Y,Padua D．Compiler analysis of irregular memory accesses[C]∥Proceedings of The ACM SIGPLAN 2000Conference on Programming Language Design and Implementation 2000．NY,USA:ACM New York,2000:157-168
[22] Zhao J,Zhao R C,Han L．A Nonlinear Array Subscripts De-pendence Test[C]∥Proceedings of the 14th IEEE International Conference on High Performance Computing and Communications．Liverpool,England,2012:764-771
[23] Aho A V,Lam M S,Sethi R,et al.Compilers principles,techniques,and tools(Second edition)[M]．Boston:Addison-Wesley,2007
[24] Amarasinghe S P,Lam M S．Communication optimization and code generation for distributed memory machine:[C]∥Procee-dings of The ACM SIGPLAN 1993Conference on Programming Language Design and Implementation (PLDI’93),1993．NY,USA:ACM New York,1993:126-138
[25] Mitchell N,Carter L,Ferrante J．Localizing Non-affine ArrayReferences[C]∥Proceedings of the 1999International Confe-rence on Parallel Architectures and Compilation Techniques (PACT’ 99),1999．DC,USA:IEEE Computer Society Wa-shington,1999:192-202
[26] 丁锐,赵荣彩,韩林．基于主导值的计算和数据划分算法[J]．计算机科学,2012,9(3):290-294

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Comments

Recommended 0

No Suggested Reading articles found!

Code Generation for Automatic Parallelization of Irregular Loops

PDF (PC)

Abstract

Cite this article

share this article

References

Related Articles 0

Metrics

Comments

Recommended 0