Computer Science (计算机科学) ›› 2016, Vol. 43 ›› Issue (5): 34-41. doi: 10.11896/j.issn.1002-137X.2016.05.006
董钰山,李春江
DONG Yu-shan and LI Chun-jiang
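Abstract: Data prefetching is a latency-hiding technique that emerged to mitigate the speed gap between microprocessors and DRAM. All current Intel processor families use several prefetching mechanisms to speed up the movement of data and code into the cache and thereby improve program performance. Based on an analysis of the memory hierarchy of the Intel64 architecture, this paper examines the data prefetching mechanisms of the X86/X64 architecture, covering both hardware and software prefetching, and analyzes compiler support for software prefetching. Finally, it measures the effect of Intel64 data prefetching on the performance of tightly nested loops in scientific computing programs and summarizes several factors that influence the effectiveness of data prefetching. This work offers guidance for prefetching optimization of array accesses in loops on Intel platforms.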
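As a rough illustration of the software prefetching mentioned in the abstract, the following minimal sketch issues prefetch hints from inside a loop over an array. It is not the authors' benchmark code. The function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical, and the distance of 64 elements is an assumed value that would need tuning on a real machine. `__builtin_prefetch` is the GCC/Clang builtin; on x86 it is typically lowered to a `PREFETCH` family instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; an assumption to be tuned. */
#define PREFETCH_DISTANCE 64

/* Sum an array while hinting that elements further ahead will be read soon. */
double sum_with_prefetch(const double *a, size_t n)
{
    assert(a != NULL || n == 0);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Only prefetch addresses that lie inside the array. */
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 3 = high temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}
```

For a simple sequential access pattern like this one, the hardware prefetchers may already cover most of the latency, so any gain from the explicit hint is likely to depend on the access pattern and the chosen distance.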