Computer Science, 2025, Vol. 52, Issue (5): 41-49. doi: 10.11896/jsjkx.241200053
• High Performance Computing •
LIAO Qiucheng1, ZHOU Yang2, LIN Xinhua1