Computer Science ›› 2025, Vol. 52 ›› Issue (5): 11-24. doi: 10.11896/jsjkx.240500103

• High Performance Computing •

Performance Evaluation and Optimization of Operating System for Domestic Supercomputer

GAO Yiqin, LUO Zhiyu, WANG Yichao, LIN Xinhua   

  1. Network & Information Center, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2024-05-23 Revised: 2024-09-05 Online: 2025-05-15 Published: 2025-05-12
  • About author: GAO Yiqin, born in 1995, Ph.D., engineer, is a member of CCF (No. L2004M). Her main research interests include high performance computing and task scheduling algorithm design.
    LIN Xinhua, born in 1979, Ph.D., senior engineer, is a member of CCF (No. 23737D). His main research interests include high performance computing and computer architecture.
  • Supported by:
    National Key Research and Development Program of China (2023YFB3002001).

Abstract: Supercomputers play a crucial role in supporting scientific computing applications. Over the current five-year period, China is developing domestic post-exascale supercomputers. As a core component of a supercomputer, the operating system (OS) introduces overhead that affects the performance of the whole system, so evaluating the OS is an important subject in supercomputer research. Among existing domestic OSs, openEuler offers high performance and compatibility on systems equipped with Kunpeng processors. However, openEuler has not yet been widely deployed on supercomputers, so it is necessary to evaluate its performance on supercomputers and to optimize the performance bottlenecks found. Our work consists of two parts. 1) We evaluate the compatibility of openEuler and its performance when running HPC applications, using CentOS as the reference for comparison. The results show that for non-communication-intensive applications, the performance of openEuler is comparable to that of CentOS. However, when OpenMPI is used for collective communication operations such as Allreduce, performance on openEuler decreases by up to 76.83%. In addition, at the thousand-core scale, the parallel efficiency of communication-intensive applications on openEuler decreases by up to 23.01%. 2) To address the MPI collective communication performance issues identified during the evaluation, we propose a performance modeling and optimization method. The method uses the Hockney model of point-to-point communication to model the performance of the various collective communication algorithm implementations. It predicts communication time for different numbers of processes and message sizes, which enables the selection of a suitable algorithm implementation. Through the MCA interface of OpenMPI, the method adjusts the algorithm implementation dynamically at runtime. After optimization, the performance of HPC applications on openEuler improves significantly, with a maximum reduction in running time of 26%.
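
The model-based selection described in the abstract can be illustrated with a minimal sketch. It assumes the Hockney point-to-point cost T(m) = alpha + beta * m (latency plus per-byte transfer time) and uses widely cited textbook cost formulas for two Allreduce algorithms, recursive doubling and ring. The function names, the per-byte reduction cost `gamma`, and the parameter values are hypothetical and are not taken from the paper; the authors' actual models and algorithm set may differ.

```python
import math


def hockney(alpha, beta, m):
    """Hockney point-to-point time for an m-byte message: alpha + beta * m."""
    return alpha + beta * m


def allreduce_recursive_doubling(p, m, alpha, beta, gamma):
    """Textbook cost: ceil(log2 p) steps, each exchanging and reducing m bytes."""
    steps = math.ceil(math.log2(p))
    return steps * (hockney(alpha, beta, m) + gamma * m)


def allreduce_ring(p, m, alpha, beta, gamma):
    """Textbook cost: reduce-scatter plus allgather, p - 1 steps each on m/p-byte chunks."""
    chunk = m / p
    communication = 2 * (p - 1) * hockney(alpha, beta, chunk)
    reduction = (p - 1) * gamma * chunk  # only the reduce-scatter phase reduces data
    return communication + reduction


# Candidate algorithms considered by this sketch (an illustrative subset).
CANDIDATES = {
    "recursive_doubling": allreduce_recursive_doubling,
    "ring": allreduce_ring,
}


def select_algorithm(p, m, alpha, beta, gamma):
    """Return (best_name, predicted_times) for p processes and an m-byte message."""
    times = {name: cost(p, m, alpha, beta, gamma) for name, cost in CANDIDATES.items()}
    best = min(times, key=times.get)
    return best, times


if __name__ == "__main__":
    # Hypothetical parameters: 1 us latency, 1 ns/byte transfer, 0.1 ns/byte reduction.
    alpha, beta, gamma = 1e-6, 1e-9, 1e-10
    for size in (8, 64 * 2**20):
        best, times = select_algorithm(64, size, alpha, beta, gamma)
        print(size, best, times)
```

With these assumed parameters, the latency term dominates for small messages and favors recursive doubling, while the bandwidth term dominates for large messages and favors the ring algorithm. In practice, alpha and beta would be fitted from point-to-point measurements on the target system, and the chosen algorithm could then be applied through Open MPI's MCA parameters; the tuned collective component exposes parameters such as `coll_tuned_use_dynamic_rules` and `coll_tuned_allreduce_algorithm`, although whether the authors use these particular parameters is an assumption here.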

Key words: High-performance computing, Domestic supercomputer, Operating system, Performance evaluation, Collective communication performance

CLC Number: 

  • TP316