Multi Level Parallel Computing for SW26010 Discontinuous Galerkin Finite Element Algorithm

doi:10.11896/jsjkx.240700055

Abstract

The discontinuous Galerkin finite element method (DGM) is a high-accuracy numerical method. To address the low efficiency and high computational cost of parallel DGM computation in electromagnetic engineering applications, a parallel DGM algorithm for the SW26010 platform is proposed. The DGM algorithm is optimized through domain decomposition, data structure reconstruction, core-level parallel computation of hotspot functions, overlapping of computation and communication, and core-level buffering optimization. Experimental results show that, compared with a DGM parallel algorithm that uses only MPI process-level parallelism, the proposed algorithm achieves an average speedup of 46.8.
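Two of the techniques listed above, domain decomposition and the overlap of computation with communication, can be illustrated with a minimal sketch. This is a generic Python illustration, not the authors' SW26010 implementation: the 1D element layout, the function names (`partition`, `split_interior_boundary`, `step`) and the use of a thread in place of non-blocking message passing are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(n_elements, n_ranks):
    """Split element indices 0..n_elements-1 into contiguous, balanced blocks."""
    base, extra = divmod(n_elements, n_ranks)
    blocks, start = [], 0
    for rank in range(n_ranks):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def split_interior_boundary(block, n_elements):
    """Boundary elements have a neighbour (index +/-1) owned by another rank."""
    boundary = [
        e for e in block
        if (e - 1 >= 0 and e - 1 not in block)
        or (e + 1 < n_elements and e + 1 not in block)
    ]
    interior = [e for e in block if e not in boundary]
    return interior, boundary


def step(block, n_elements, fetch_halo, update):
    """One update of a rank's block, hiding the halo exchange behind interior work.

    fetch_halo and update are hypothetical callbacks supplied by the caller.
    """
    interior, boundary = split_interior_boundary(block, n_elements)
    with ThreadPoolExecutor(max_workers=1) as pool:
        # The thread stands in for a non-blocking exchange with neighbouring ranks.
        pending = pool.submit(fetch_halo, boundary)
        # Interior elements need no remote data, so they are updated meanwhile.
        results = [update(e, None) for e in interior]
        halo = pending.result()
    # Boundary elements are updated once the neighbour data has arrived.
    results += [update(e, halo) for e in boundary]
    return results
```

The point of the split is that only boundary elements wait for remote data, so the larger the interior of each subdomain, the more of the communication time can be hidden.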

Key words: Discontinuous Galerkin finite element method, Numerical simulation, Parallel computing, Domain decomposition

CLC Number: TP391

WANG Xiaozhong, ZHANG Zuyu. Multi Level Parallel Computing for SW26010 Discontinuous Galerkin Finite Element Algorithm [J]. Computer Science, 2024, 51(11A): 240700055-5.
