Computer Science, 2023, Vol. 50, Issue (2): 353-363. doi: 10.11896/jsjkx.220100163

• Interdisciplinary & Frontier •

Survey of Container Technology for High-performance Computing System

CHEN Yiyang1,2, WANG Xiaoning1, LU Shasha1, XIAO Haili1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2022-01-18 Revised: 2022-07-19 Online: 2023-02-15 Published: 2023-02-22
  • Corresponding author: WANG Xiaoning (wxn@sccas.cn)
  • About the author: (chenyiyang@cnic.cn)
  • Supported by:
    Strategic Priority Research Program of the Chinese Academy of Sciences (Category A) (XDA19020101)

Abstract: Container technology is widely used in the cloud computing industry, mainly for rapid migration and automated deployment of service software environments. With the deep integration of high-performance computing, big data, and artificial intelligence, the software dependencies and configuration of applications on high-performance computing systems are becoming increasingly complex, and supercomputing centers face growing demand for user-defined software stacks. As a result, a variety of container implementations have been developed for high-performance computing environments to meet practical needs such as user-defined software stacks. This paper summarizes the development history of container technology, explains the technical principles of containers on the Linux platform, analyzes and evaluates the container implementations suited to high-performance computing systems, and finally discusses future research directions for container technology in high-performance computing systems.

Key words: High performance computing, Container, Virtualization, Application software deployment

CLC Number: TP391