Computer Science ›› 2024, Vol. 51 ›› Issue (9): 31-39. doi: 10.11896/jsjkx.230300188

• High Performance Computing •

Heterogeneous Parallel Computing and Performance Optimization for DSMC/PIC Coupled Simulation Based on MPI+CUDA

LIN Yongzhen1,2, XU Chuanfu1,2, QIU Haozhong1,2, WANG Qingsong1,2, WANG Zhenghua2, YANG Fuxiang3,4, LI Jie3

    1 Institute for Quantum Information & State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410000, China
    2 College of Computer, National University of Defense Technology, Changsha 410000, China
    3 College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410000, China
    4 Army Military Transportation University, Bengbu, Anhui 233000, China
  • Received: 2023-03-23 Revised: 2023-06-29 Online: 2024-09-15 Published: 2024-09-10
  • Corresponding author: XU Chuanfu (xuchuanfu@nudt.edu.cn)
  • About author: LIN Yongzhen (lyznudt@nudt.edu.cn), born in 1997, postgraduate. His main research interests include parallel computing and applications.
    XU Chuanfu, born in 1980, Ph.D, associate researcher, master supervisor. His main research interests include parallel computing and large-scale science and engineering computing.

Abstract: DSMC/PIC coupled simulation is an important class of high-performance computing applications. Large-scale DSMC/PIC coupled simulations are computationally expensive and require efficient parallel computing. Because particles are dynamically injected and migrated, MPI-based DSMC/PIC coupled simulations often incur high communication overhead and struggle to achieve load balance. To address these issues, we design and implement an efficient MPI+CUDA heterogeneous parallel algorithm on top of the existing optimized MPI-parallel version of our self-developed DSMC/PIC coupled simulation software. Taking into account the GPU architecture and the computational characteristics of DSMC/PIC, we carry out a series of performance optimizations, including GPU memory access optimization, GPU thread workload optimization, CPU-GPU data transfer optimization, and DSMC/PIC data conflict optimization. On NVIDIA V100 and A100 GPUs in the Beijing Beilong Super Cloud HPC system, we perform large-scale heterogeneous parallel DSMC/PIC coupled simulations of a pulsed vacuum arc plasma plume application with hundreds of millions of particles. Compared with the original pure MPI parallelism, the GPU heterogeneous parallelism significantly reduces simulation time, reaching a speedup of 550% on two GPU cards relative to 192 CPU cores, and it also shows better strong scalability.

Key words: Coupled DSMC/PIC, Particle simulation, Heterogeneous parallelism, MPI+CUDA

CLC Number:

  • TP391