Computer Science ›› 2022, Vol. 49 ›› Issue (10): 44-51.doi: 10.11896/jsjkx.210800247
• High Perfonnance Computing • Previous Articles Next Articles
WANG Bo-yang1, PANG Jian-min1,2, XU Jin-long2, ZHAO Jie2, TAO Xiao-han2, ZHU Yu2
CLC Number:
[1]KANG H,KWON H C,KIM D.HPMaX:heterogeneous parallel matrix multiplication using CPUs and GPUs[J].Computing,2020,102(12):2607-2631. [2]HEMEIDA A M,HASSAN S A,ALKHALAF S,et al.Optimi-zing matrix-matrix multiplication on intel'sadvanced vector extensions multicore processor[J].Ain Shams Engineering Journal,2020,11(4):1179-1190. [3]HOSSEIN A,ASADOLLAH S.SIMD programming using Intelvector extensions - ScienceDirect[J].Journal of Parallel and Distributed Computing,2020,135:83-100. [4]FEAUTRIER P,LENGAUER C.Polyhedron model[J/OL].https://link.springer.com/referenceworkentry/10.1007/978-0-387-09766-4_502. [5]VERDOOLAEGE S,CARLOS JUEGA J,COHEN A,et al.Po-lyhedral parallel code generation for CUDA[J].ACM Transactions on Architecture and Code Optimization(TACO),2013,9(4):54:1-54:24. [6]BONDHUGULA U,HARTONO A,RAMANUJAM J,et al.A practical automatic polyhedral parallelizer and locality optimizer[C]//Proceedings of the 2008 ACM SIGPLAN Conference on Programming Language Design and Implementation(PLDI'08).ACM,2008. [7]GAO W,ZHAO R C,HAN L,et al.Overview of SIMD automa-tic vectorization compilation optimization [J].Journal of Software,2015,26(6):1265-1284. [8]EDAMATSU T,TAKAHASHI D.Accelerating Large Integer Multiplication Using Intel AVX-512IFMA[M].Algorithms and Architectures for Parallel Processing,2020:60-74. [9]STEPHENS N,BILES S,BOETTCHER M,et al.The ARMScalable Vector Extension[J].IEEE Micro,2017,37(2):26-39. [10]ZHAO J,LI Y Y,ZHAO R C.“Black magic” of polyhedral compilation[J].Journal of Software,2018,29(8):2371-2396. [11]VERDOOLAEGE S,GUELTON S,GROSSER T.ScheduleTrees[J].Geodinamica Acta,2013,25(1/2):86-95. [12]GROSSER T,VERDOOLAEGE S,COHEN A.Polyhedral AST generation is more than scanning polyhedra[J].Acm Transactions on Programming Languages & Systems,2015,37(4):1-50. [13]KONG M,VERAS R,STOCK K,et al.When polyhedral transformations meet SIMD code generation[C]// Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation.ACM,2013. [14]LENGAUER G C.Polly-Performing Polyhedral Optimizations on a Low-Level Intermediate Representation[J].Parallel Processing Letters,2012,22(4):14-53. [15]VERDOOLAEGE S,JANSSENS G.Scheduling for PPCG[EB/OL].https://www.researchgate.net/publication/317826152_Scheduling_for_PPCG. [16]LIU Z,TIAN X.Matrix multiplication vectorization method for multi-core vector processors[J].Chinese Journal of Computers,2018,41(10):2251-2264. [17]LIU G,ZHANG H,MAO R,et al.Optimization of DGEMMFunction for Loongson3B1500 Architecture[J].Small Microcomputer System,2014,35(7):1523-1527. |
[1] | CHEN Tao, SHU Hui, XIONG Xiao-bing. Study of Universal Shellcode Generation Technology [J]. Computer Science, 2021, 48(4): 288-294. |
[2] | HU Rong, YANG Wang-dong, WANG Hao-tian, LUO Hui-zhang, LI Ken-li. Parallel WMD Algorithm Based on GPU Acceleration [J]. Computer Science, 2021, 48(12): 24-28. |
[3] | HAN Xiao-dong, GAO Fei, ZHANG Li-wei. Novel Real-time Algorithm for Critical Path of Linear Network Coding [J]. Computer Science, 2020, 47(9): 232-237. |
[4] | YANG Ping, WANG Sheng-yuan. Analysis of Target Code Generation Mechanism of CompCert Compiler [J]. Computer Science, 2020, 47(9): 17-23. |
[5] | DING Rong, YU Qian-hui. Growth Framework of Autonomous Unmanned Systems Based on AADL [J]. Computer Science, 2020, 47(12): 87-92. |
[6] | YANG Fei, MA Yu-chun, HOU Jin and XU Ning. Research on Acceleration of Matrix Multiplication Based on Parallel Scheduling on MPSoC [J]. Computer Science, 2017, 44(8): 36-41. |
[7] | WU Wei-zu, LIU Li-qun and XIE Dong-qing. Vectorized Representation of Heterogeneous Network Based on Neural Networks [J]. Computer Science, 2017, 44(5): 272-275. |
[8] | GAO Wei, ZHAO Rong-cai, YU Hai-ning and ZHANG Qing-hua. Loop Unrolling in Vectorized Programs [J]. Computer Science, 2016, 43(1): 226-231. |
[9] | LI Peng-yuan, ZHAO Rong-cai, GAO Wei and ZHANG Qing-hua. Effective Vectorization Technique for Interleaved Data with Constant Strides [J]. Computer Science, 2015, 42(5): 194-199. |
[10] | YIN Meng-jia, XU Xian-bin, XIONG Zeng-gang and ZHANG Tao. Quantitative Performance Analysis Model of Matrix Multiplication Based on GPU [J]. Computer Science, 2015, 42(12): 13-17. |
[11] | XIANG Lin-hong,CHEN Yu-wen and ZHANG Yu-lin. High Order Matrix Multiplication by MapReduce Algorithm Based on Hadoop Platform [J]. Computer Science, 2013, 40(Z6): 96-98. |
[12] | ZHANG Li-yong CHEN Ping (Software Engineering Institute, Xidian University, Xi'an 710071, China ). [J]. Computer Science, 2008, 35(5): 284-287. |
|