Computer Science, 2024, Vol. 51, Issue (4): 56-66. doi: 10.11896/jsjkx.231000124
• High Performance Computing •
DI Jianqiang1,2, YUAN Liang1, ZHANG Yunquan1, ZHANG Sijia2