Computer Science, 2022, Vol. 49, Issue (11): 76-82. doi: 10.11896/jsjkx.211200252
• Computer Software •
GAO Xiu-wu, HUANG Liang-ming, JIANG Jun