Computer Science ›› 2022, Vol. 49 ›› Issue (10): 18-26.doi: 10.11896/jsjkx.220100147
• High Perfonnance Computing • Previous Articles Next Articles
BAO Yi-kun1,2, ZHANG Peng2,3, XU Xiao-wen2,3, MO Ze-yao3,4
CLC Number:
[1]DATTA K.Auto-tuning stencil codes for cache-based multicore platforms[D].Berkeley:University of California,2009. [2] XU Y T,FU H H,GAN L,et al.Performance Optimization and Analysis for Different Stencil Kernels on Multi-Core and Many-Core Architectures[C]//2013 CCF National Annual Conference on High Performance Computing.2013:628-637. [3]LIU S,WU W G,ZHAO B,et al.Loop Tiling for Optimization of Locality and Parallelism[J].Journal of Computer Research and Development,2015,52(5):1160-1176. [4]WOLF M,LAM M.A data Locality Optimizing Algorithm[C]//Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation.1991:30-44. [5]LAM M S.The cache preformance and optimizations of Blocked Argorithm[J].ACM SIGPLAN Notices,1991(4):63-74. [6]MITCHELL N,HOGSTEDT K,CARTER L,et al.Quantifying the Multi-Level Nature of Tiling Interactions[J].International Journal of Parallel Programming,1998,26(6):641-670. [7]SARKAR V,MEGIDDO N.An analytical model for loop tiling and its solution[C]//2000 IEEE International Symposium on Performance Analysis of Systems and Software.ISPASS(Cat.No.00EX422).IEEE,2000:146-153. [8]HSU C H,KREMER U.A quantitative analysis of tile size selection algorithms[J].The Journal of Supercomputing,2004,27(3):279-294. [9]LIU J,ZHANG Y,DING W,et al.On-chip cache hierarchy-aware tile scheduling for multicore machines[C]//International Symposium on Code Generation and Optimization(CGO 2011).IEEE,2011:161-170. [10]MEHTA S,BEERAKA G,YEW P C.Tile size selection revisited[J].ACM Transactions on Architecture and Code Optimization(TACO),2013,10(4):1-27. [11]MEHTA S,GARG R,TRIVEDI N,et al.Turbotiling:Leveraging prefetching to boost performance of tiled codes[C]//Proceedings of the 2016 International Conference on Supercompu-ting.2016:1-12. [12]SATO Y,YUKI T,ENDO T.An autotuning framework forscalable execution of tiled code via iterative polyhedral compilation[J].ACM Transactions on Architecture and Code Optimization(TACO),2019,15(4):1-23. [13]ZHU Y,PANG J M,XU J L,et al.Adaptive Tiling Size Algo-rithm for 3D Stencil Computation on SW26010 Many-core Processor[J].Journal of Computer Science,2021,48(6):10-18. [14]KISUKI T,KNIJNENBURG P M W,O'BOYLE M F P.Combined selection of tile sizes and unroll factors using iterative compilation[C]//Proceedings 2000 International Conference on Parallel Architectures and Compilation Techniques(Cat.No.PR00622).IEEE,2000:237-246. [15]WHALEY R C,PETITET A,DONGARRA J J.Automated em-pirical optimizations of software and the ATLAS project[J].Parallel Computing,2001,27(1/2):3-35. [16]KNIJNENBURG P M W,KISUKI T,GALLIVAN K,et al.The effect of cache models on iterative compilation for combined ti-ling and unrolling[J].Concurrency and Computation:Practice and Experience,2004,16(2/3):247-270. [17]CHEN C,CHAME J,HALL M.Combining models and guided empirical search to optimize for multiple levels of the memory hierarchy[C]//International Symposium on Code Generation and Optimization.IEEE,2005:111-122. [18]YOTOV K,PINGALI K,STODGHILL P.Think globally,search locally[C]//Proceedings of the 19th Annual Interna-tional Conference on Supercomputing.2005:141-150. [19]TIWARI A,CHEN C,CHAME J,et al.A scalable auto-tuning framework for compiler optimization[C]//2009 IEEE International Symposium on Parallel & Distributed Processing.IEEE,2009:1-12. [20]GOTO K,VAN DE GEIJN R.High-performance implementa-tion of the level-3 BLAS[J].ACM Transactions on Mathematical Software(TOMS),2008,35(1):1-14. [21]SHIRAKO J,SHARMA K,FAUZIA N,et al.Analytical bounds for optimal tile size selection[C]//International Conference on Compiler Construction.Berlin:Springer,2012:101-121. [22]YUKI T,RENGANARAYANAN L,RAJOPADHYE S,et al.Automatic creation of tile size selection models[C]//Procee-dings of the 8th Annual IEEE/ACM International Symposium on Code Generation and Optimization.2010:190-199. [23]MALIK A M.Optimal tile size selection problem using machine learning[C]//2012 11th International Conference on Machine Learning and Applications.IEEE,2012:275-280. [24]ZHANG P,FANG J,YANG C,et al.Optimizing streaming para-llelism on heterogeneous many-core architectures[J].IEEE Transactions on Parallel and Distributed Systems,2020,31(8):1878-1896. [25]LIU S,CUI Y,JIANG Q,et al.An efficient tile size selection model based on machine learning[J].Journal of Parallel and Distributed Computing,2018,121:27-41. [26]CHI H Y,CHEN C B.Prediction of Loop Tiling Size Based on Neural Network[J].Journal of Computer Science,2020:47(8):62-70. [27]KAMIL S,HUSBANDS P,OLIKER L,et al.Impact of modern memory subsystems on cache optimizations for stencil computations[C]//Proceedings of the 2005 Workshop on Memory System Performance.2005:36-43. [28]BORCHANI H,VARANDO G,BIELZA C,et al.A survey on multi-output regression[J].Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2015,5(5):216-233. |
[1] | LENG Dian-dian, DU Peng, CHEN Jian-ting, XIANG Yang. Automated Container Terminal Oriented Travel Time Estimation of AGV [J]. Computer Science, 2022, 49(9): 208-214. |
[2] | NING Han-yang, MA Miao, YANG Bo, LIU Shi-chang. Research Progress and Analysis on Intelligent Cryptology [J]. Computer Science, 2022, 49(9): 288-296. |
[3] | HE Qiang, YIN Zhen-yu, HUANG Min, WANG Xing-wei, WANG Yuan-tian, CUI Shuo, ZHAO Yong. Survey of Influence Analysis of Evolutionary Network Based on Big Data [J]. Computer Science, 2022, 49(8): 1-11. |
[4] | LI Yao, LI Tao, LI Qi-fan, LIANG Jia-rui, Ibegbu Nnamdi JULIAN, CHEN Jun-jie, GUO Hao. Construction and Multi-feature Fusion Classification Research Based on Multi-scale Sparse Brain Functional Hyper-network [J]. Computer Science, 2022, 49(8): 257-266. |
[5] | ZHANG Guang-hua, GAO Tian-jiao, CHEN Zhen-guo, YU Nai-wen. Study on Malware Classification Based on N-Gram Static Analysis Technology [J]. Computer Science, 2022, 49(8): 336-343. |
[6] | CHEN Ming-xin, ZHANG Jun-bo, LI Tian-rui. Survey on Attacks and Defenses in Federated Learning [J]. Computer Science, 2022, 49(7): 310-323. |
[7] | XIAO Zhi-hong, HAN Ye-tong, ZOU Yong-pan. Study on Activity Recognition Based on Multi-source Data and Logical Reasoning [J]. Computer Science, 2022, 49(6A): 397-406. |
[8] | YAO Ye, ZHU Yi-an, QIAN Liang, JIA Yao, ZHANG Li-xiang, LIU Rui-liang. Android Malware Detection Method Based on Heterogeneous Model Fusion [J]. Computer Science, 2022, 49(6A): 508-515. |
[9] | WANG Fei, HUANG Tao, YANG Ye. Study on Machine Learning Algorithms for Life Prediction of IGBT Devices Based on Stacking Multi-model Fusion [J]. Computer Science, 2022, 49(6A): 784-789. |
[10] | LI Ya-ru, ZHANG Yu-lai, WANG Jia-chen. Survey on Bayesian Optimization Methods for Hyper-parameter Tuning [J]. Computer Science, 2022, 49(6A): 86-92. |
[11] | ZHAO Lu, YUAN Li-ming, HAO Kun. Review of Multi-instance Learning Algorithms [J]. Computer Science, 2022, 49(6A): 93-99. |
[12] | XU Jie, ZHU Yu-kun, XING Chun-xiao. Application of Machine Learning in Financial Asset Pricing:A Review [J]. Computer Science, 2022, 49(6): 276-286. |
[13] | JI Ying-rui, YUAN Liang, ZHANG Yun-quan. Parallelization and Locality Optimization for Red-Black Gauss-Seidel Stencil [J]. Computer Science, 2022, 49(5): 363-370. |
[14] | LI Ye, CHEN Song-can. Physics-informed Neural Networks:Recent Advances and Prospects [J]. Computer Science, 2022, 49(4): 254-262. |
[15] | YAO Xiao-ming, DING Shi-chang, ZHAO Tao, HUANG Hong, LUO Jar-der, FU Xiao-ming. Big Data-driven Based Socioeconomic Status Analysis:A Survey [J]. Computer Science, 2022, 49(4): 80-87. |
|