Computer Science, 2022, Vol. 49, Issue (6): 99-107. doi: 10.11896/jsjkx.210400157

• High Performance Computing •

Parallel Optimization Method of Unstructured-grid Computing in CFD for Domestic Heterogeneous Many-core Architecture

CHEN Xin1, LI Fang1, DING Hai-xin2, SUN Wei-ze1, LIU Xin1, CHEN De-xun1, YE Yue-jin1, HE Xiang1   

  1 National Super Computing Center in Wuxi, Wuxi, Jiangsu 214000, China
  2 China Aerodynamics Research and Development Center, Mianyang, Sichuan 621000, China
  • Received: 2021-04-15 Revised: 2021-07-15 Online: 2022-06-15 Published: 2022-06-08
  • About author: CHEN Xin, born in 1994, holds a master's degree and is a research assistant. His main research interests include computational fluid dynamics and high-performance parallel computation and application.
    LI Fang, born in 1980, Ph.D., is an associate researcher. Her main research interests include computational fluid dynamics and high-performance parallel computation and application.
  • Supported by:
    National Key Research and Development Project of China (2016YFB0201100) and National Science and Technology Major Project (2017-I-0004-0004).

Abstract: Sunway TaihuLight ranked first on the global TOP500 supercomputer list from 2016 to 2018, with a peak performance of 125.4 PFlops. Its computing power comes mainly from the domestic SW26010 many-core RISC processor. Unstructured-grid computing in CFD has long been difficult to port to and optimize on domestic many-core supercomputers because of its complex topology, severe discrete memory access, and the strongly correlated solution of linear equations. To make full use of the computing efficiency of the domestic heterogeneous many-core architecture, a data reconstruction model is first proposed; it improves data locality and parallelism and makes the data structure better suited to the many-core architecture. Second, to address the discrete memory access caused by the disordered storage of unstructured-grid data, an optimization method based on pre-storing relational information is proposed, which transforms discrete memory accesses into continuous ones. Finally, a pipeline parallelism mechanism within the core array is introduced to achieve many-core parallelism when solving strongly correlated linear equations. Experiments show that the overall performance of unstructured-grid computing in CFD improves by more than 4 times and is 1.2x faster than a general-purpose CPU. The computation scales to 624 000 cores, with parallel efficiency maintained at 64.5%.
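
The idea of pre-storing relational information so that a kernel reads contiguous data can be illustrated with a small sketch. This is not the paper's implementation: the function names, the `owner`/`neighbour` face arrays, the block size, and the use of Python lists are all assumptions made for illustration. A face loop over an unstructured grid normally reads cell values through index arrays (discrete access). Here the set of cells touched by each block of faces is computed once in advance, so the kernel only needs one bulk gather per block into a small contiguous buffer and then works from that buffer.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not the code described in the paper.

def flux_indirect(cell_values, owner, neighbour):
    """Baseline face loop: every read goes through an index array,
    so consecutive faces may touch scattered memory locations."""
    return [cell_values[neighbour[f]] - cell_values[owner[f]]
            for f in range(len(owner))]


def prestore_gather_plan(owner, neighbour, block_size):
    """Precompute, once per mesh, the cells each block of faces touches.

    Each plan entry holds the sorted unique cell ids of the block and,
    for every face, the positions of its two cells in that list.
    """
    plan = []
    for start in range(0, len(owner), block_size):
        own = owner[start:start + block_size]
        nei = neighbour[start:start + block_size]
        cells = sorted(set(own) | set(nei))
        local = {c: i for i, c in enumerate(cells)}
        plan.append((cells,
                     [local[c] for c in own],
                     [local[c] for c in nei]))
    return plan


def flux_prestored(cell_values, plan):
    """Face loop that uses the pre-stored plan."""
    out = []
    for cells, own_local, nei_local in plan:
        # One bulk gather fills a small contiguous buffer
        # (a stand-in for fast local memory on a compute core).
        buf = [cell_values[c] for c in cells]
        # The kernel itself reads only from the contiguous buffer.
        out.extend(buf[n] - buf[o] for o, n in zip(own_local, nei_local))
    return out


# Tiny hypothetical mesh: 6 cells, 6 faces.
cell_values = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]
owner = [0, 0, 1, 2, 3, 4]
neighbour = [1, 3, 2, 5, 4, 5]

plan = prestore_gather_plan(owner, neighbour, block_size=3)
assert flux_prestored(cell_values, plan) == flux_indirect(cell_values, owner, neighbour)
```

The plan depends only on mesh connectivity, so under these assumptions it can be built once and reused for every time step. The cost is extra memory for the stored index lists.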

Key words: Computational fluid dynamics, Heterogeneous many-core, Parallel computing, Sunway supercomputer, Unstructured-grid

CLC Number: TP311