Computer Science, 2016, Vol. 43, Issue (1): 226-231. doi: 10.11896/j.issn.1002-137X.2016.01.049
GAO Wei, ZHAO Rong-cai, YU Hai-ning and ZHANG Qing-hua
[1] Sarkar V. Optimized unrolling of nested loops[C]∥Proc. of the 14th Int’l Conf. on Supercomputing. New Mexico: ACM Press, 2000
[2] Li W L, Liu L, Tang Z Z. Loop unrolling optimization for software pipelining[J]. Journal of Beijing University of Aeronautics and Astronautics, 2004, 30(11): 1111-1115 (in Chinese)
[3] Lam Y M, Coutinho J G F, Luk W, et al. Unrolling-based loop mapping and scheduling[C]∥2008 International Conference on Field-Programmable Technology (ICFPT’2008). Taipei, Taiwan, 2008: 321-324
[4] Callahan D, Kennedy K, Porterfield A. Software prefetching[C]∥Proc. of the 4th Int’l Conf. on Architectural Support for Programming Languages and Operating Systems. ACM Press, 1991
[5] Zhu J H. Research on SIMD Compiling Optimization Techniques[D]. Shanghai: Fudan University, 2005 (in Chinese)
[6] Jiang W H, Mei C, Guo Y. Vectorization for Real-life Multimedia Applications on Processors’ Multimedia Extensions[J]. Chinese Journal of Computers, 2005, 28(8): 1254-1266 (in Chinese)
[7] Zhang W H, Zang B Y. Research on SIMD Compiling Optimization Techniques[J]. Communications of the China Computer Federation (CCCF), 2007, 3(2): 27-36 (in Chinese)
[8] Intel Corp. Intel C/C++ and Intel Fortran Compilers for Linux[EB/OL]. http://www.intel.com/software/products/compilers
[9] The GNU Compiler Collection[EB/OL]. http://gcc.gnu.org
[10] Pathscale Compiler User’s Guide[EB/OL]. http://www.pathscale.com
[11] http://sourceforge.net/projects/open64/files/open64/Open64-5.0
[12] Bacon D F, Graham S L, Sharp O J. Compiler Transformations for High-Performance Computing[J]. ACM Computing Surveys, 1994, 26(4): 345-420
[13] Alexander M J, Bailey M W, Childers B R, et al. Memory bandwidth optimizations for wide-bus machines[C]∥Proceedings of the 26th Hawaii International Conference on System Sciences. Wailea, Hawaii, 1993: 466-475
[14] Mowry T C. Tolerating Latency Through Software-Controlled Data Prefetching[D]. Stanford University, March 1994
[15] Fog A. Optimizing subroutines in assembly language[D]. Copenhagen University College of Engineering, September 2012
[16] Ma Y, Carr S. Register Pressure Guided Unroll-and-Jam[M]∥The 2008 Open64 Workshop. Boston, MA, USA, April 6, 2008
[17] Carr S, Guan Y. Unroll and Jam using Uniformly Generated Sets[C]∥Proceedings of the 30th Annual International Symposium on Microarchitecture (MICRO-30). 1997: 349-357
[18] Carr S, Kennedy K. Improving the Ratio of Memory Operations to Floating-Point Operations in Loops[J]. ACM Transactions on Programming Languages and Systems (ToPLaS), 1994, 16(6): 1768-1810
[19] Mark S, Saman A. Predicting Unroll Factors Using Supervised Classification[C]∥Proceedings of the International Symposium on Code Generation and Optimization (CGO). 2003: 204-215
[20] Monsifrot A, Bodin F, Quiniou R. A Machine Learning Approach to Automatic Production of Compiler Heuristics[M]∥Artificial Intelligence: Methodology, Systems, Applications. 2002: 41-50
[21] Mowry T C, Lam M S, Gupta A. Design and evaluation of a compiler algorithm for prefetching[C]∥Proceedings of the Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. Massachusetts: ACM Press, 1992: 62-73