Computer Science, 2025, Vol. 52, Issue (5): 11-24. doi: 10.11896/jsjkx.240500103
• High Performance Computing •
GAO Yiqin, LUO Zhiyu, WANG Yichao, LIN Xinhua