Computer Science (计算机科学), 2016, Vol. 43, Issue (5): 34-41. doi: 10.11896/j.issn.1002-137X.2016.05.006

Intel64体系结构的数据预取机制及效果

董钰山,李春江   

  1. 国防科学技术大学计算机学院软件研究所 长沙410073,国防科学技术大学计算机学院软件研究所 长沙410073
  • 出版日期:2018-12-01 发布日期:2018-12-01
  • 基金资助:
    本文受国家自然科学基金项目:多核多线程处理器SIMD扩展的编程模型及编译优化关键技术研究(61170046), 863计划项目:面向国产飞腾处理器的并行程序综合优化系统(2012AA010903)资助

Mechanism and Capability of Data Prefetching in Intel64 Architecture

DONG Yu-shan and LI Chun-jiang   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Data prefetching is a technique for hiding memory access latency, introduced to mitigate the speed gap between the microprocessor and DRAM. Current Intel processor families employ several prefetching mechanisms to accelerate the movement of data and code into the cache and thereby improve program performance. Starting from an analysis of the memory hierarchy of the Intel64 architecture, this paper dissects the data prefetching mechanisms of the X86/X64 architecture, covering both hardware prefetching and software prefetching, and analyzes compiler support for software prefetching. It then measures the effect of Intel64 data prefetching on the performance of tightly nested loops in scientific computing programs and summarizes several factors that influence the effectiveness of data prefetching. The work offers guidance for loop-array prefetching optimization on Intel platforms.

Key words: Intel 64, Cache, Hardware prefetching, Software prefetching, GCC, ICC
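
As a rough illustration of the compiler-supported software prefetching mentioned in the abstract, the C sketch below issues a prefetch hint a fixed number of elements ahead of the current loop index. It is not taken from the paper: the function name `sum_with_prefetch` and the constant `PREFETCH_DISTANCE` are hypothetical, and the distance of 16 elements is an arbitrary assumption that would need tuning by measurement. The only compiler feature used is the `__builtin_prefetch` builtin provided by GCC.

```c
#include <stddef.h>

/* Hypothetical look-ahead distance, in elements. The value 16 is an
   assumption for illustration; a suitable distance depends on the
   machine and the loop body and has to be found by measurement. */
#define PREFETCH_DISTANCE 16

/* Sum a[0..n-1] while hinting that a later element will be read soon. */
double sum_with_prefetch(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* GCC builtin: second argument 0 = prefetch for read,
               third argument 3 = high temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        }
        sum += a[i];
    }
    return sum;
}
```

A caller would simply pass an array and its length, for example `sum_with_prefetch(data, count)`. A purely sequential access pattern like this one is usually already handled by hardware prefetchers, so the explicit hint may bring no gain here; the sketch only shows where such a hint sits in a loop.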
