Computer Science (计算机科学) ›› 2014, Vol. 41 ›› Issue (9): 28-31. doi: 10.11896/j.issn.1002-137X.2014.09.004
刘鹏,赵荣彩,赵博,高伟
LIU Peng, ZHAO Rong-cai, ZHAO Bo and GAO Wei
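Abstract: With the spread of multimedia applications and the growing demand for high-performance computing, more and more processors integrate SIMD extensions. To automatically generate efficient vectorized code for different SIMD extension units, a virtual vector instruction set was designed, and on top of it a unified vectorization framework for SIMD extension units was built. The input program passes through phases such as vector recognition and is transformed into an intermediate representation made of virtual vector instructions; vector-length devirtualization and instruction-set devirtualization then convert it into the vector instruction set of a specific SIMD unit. Experimental results on Sunway 1600 (申威1600), DSP, and Alpha show that the unified framework can automatically produce efficient vectorized code for all three platforms, and that the speedup on DSP is clearly better than on the other two platforms.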
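The two devirtualization steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `VirtualVecOp` type, the `TARGETS` table, the mnemonics `ADD4`/`ADD8`, and the scalar fallback for leftover elements are all hypothetical, chosen only to show how a width-independent virtual instruction might first be split to a target's lane count and then mapped to target-specific mnemonics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualVecOp:
    """A virtual vector instruction whose length is not tied to any hardware width."""
    opcode: str    # hypothetical virtual opcode, e.g. "vadd"
    dst: str       # destination array name
    srcs: tuple    # source array names
    length: int    # number of elements the operation covers


# Hypothetical target descriptions; these are NOT real Sunway, DSP, or Alpha ISAs.
TARGETS = {
    "wide4": {"lanes": 4, "mnemonics": {"vadd": "ADD4", "vmul": "MUL4"}},
    "wide8": {"lanes": 8, "mnemonics": {"vadd": "ADD8", "vmul": "MUL8"}},
}


def devirtualize_length(op, lanes):
    """Step 1: split one virtual op into (op, start, count) pieces of at most `lanes` elements."""
    return [
        (op, start, min(lanes, op.length - start))
        for start in range(0, op.length, lanes)
    ]


def devirtualize_isa(pieces, target):
    """Step 2: map each piece to a target mnemonic.

    Assumption: a piece narrower than the target's lane count falls back
    to one scalar operation per element.
    """
    lanes = target["lanes"]
    out = []
    for op, start, count in pieces:
        names = (op.dst, *op.srcs)
        if count == lanes:
            operands = ", ".join(f"{n}[{start}:{start + count}]" for n in names)
            out.append(f"{target['mnemonics'][op.opcode]} {operands}")
        else:
            for i in range(start, start + count):
                operands = ", ".join(f"{n}[{i}]" for n in names)
                out.append(f"scalar_{op.opcode[1:]} {operands}")
    return out


def lower(op, target_name):
    """Run both devirtualization steps for the named hypothetical target."""
    target = TARGETS[target_name]
    return devirtualize_isa(devirtualize_length(op, target["lanes"]), target)


if __name__ == "__main__":
    op = VirtualVecOp("vadd", "c", ("a", "b"), 10)
    for name in TARGETS:
        print(name, lower(op, name))
```

Keeping the two steps as separate functions mirrors the ordering in the abstract: the same virtual instruction is lowered for a different target by changing only the target description, not the front-end phases that produced it.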