Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 240700055-5. doi: 10.11896/jsjkx.240700055

• Computer Software & Architecture •

Multi Level Parallel Computing for SW26010 Discontinuous Galerkin Finite Element Algorithm

WANG Xiaozhong1, ZHANG Zuyu2

  1 Jiangsu Union Technical Institute, Wuxi, Jiangsu 214000, China
  2 School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • Online: 2024-11-16 Published: 2024-11-13
  • Corresponding author: WANG Xiaozhong (57439353@qq.com)
  • About author: WANG Xiaozhong, born in 1981, M.S., associate professor. His main research interests include control theory and control engineering, and high performance computing.
  • Supported by:
    Programme of Introducing Talents of Discipline to Universities (B23008) and Future Network Research Fund Project (FNSRFP2021YB11).

Abstract: The discontinuous Galerkin finite element method (DGM) is a high-precision numerical algorithm. To address the low parallel efficiency and heavy computational load of DGM in electromagnetic engineering applications, a parallel DGM algorithm for the SW26010 platform is proposed. The DGM algorithm is parallelized and optimized through domain decomposition, data structure reconstruction, slave-core parallel computation of hotspot functions, overlap of computation and communication, and slave-core buffer optimization. Experimental results show that, compared with the MPI process-level parallel DGM algorithm, the proposed algorithm achieves an average speedup of 46.8.

Key words: Discontinuous Galerkin finite element method, Numerical simulation, Parallel computing, Domain decomposition
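
As a rough illustration of the domain decomposition step named in the abstract, the sketch below splits a range of mesh element indices into contiguous, balanced blocks, one per process. It is not the authors' implementation: the function name `partition_elements` and the one-dimensional block layout are assumptions made for illustration only, and a real DGM code would more likely partition an unstructured mesh with a graph partitioner.

```python
from typing import List, Tuple


def partition_elements(n_elements: int, n_ranks: int) -> List[Tuple[int, int]]:
    """Split element indices 0..n_elements-1 into n_ranks contiguous blocks.

    Hypothetical helper (not from the paper). Returns one half-open
    (start, stop) range per rank; block sizes differ by at most one element.
    """
    if n_ranks <= 0:
        raise ValueError("n_ranks must be positive")
    if n_elements < 0:
        raise ValueError("n_elements must be non-negative")

    # Every rank gets `base` elements; the first `extra` ranks get one more.
    base, extra = divmod(n_elements, n_ranks)

    ranges = []
    start = 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


if __name__ == "__main__":
    # Example: 10 elements shared among 3 processes.
    for rank, (start, stop) in enumerate(partition_elements(10, 3)):
        print(f"rank {rank}: elements [{start}, {stop})")
```

Each process would then compute on its own block and exchange only the data on block boundaries with its neighbours, which is where overlapping computation with communication becomes useful.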

CLC Number:

  • TP391