Computer Science (计算机科学) ›› 2020, Vol. 47 ›› Issue (4): 13-17. doi: 10.11896/jsjkx.191000010
LV Xiao-jing (吕小敬)1, LIU Zhao (刘钊)2, CHU Xue-sen (褚学森)3, SHI Shu-peng (石树鹏)1, MENG Hong-song (孟虹松)1, HUANG Zhen-chun (黄震春)2
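Abstract: The Lattice Boltzmann Method (LBM) is a computational fluid dynamics method that works at the mesoscopic simulation scale, and it is widely used in both theoretical research and engineering. Improving the parallel simulation capability of LBM fluid software is an important topic in high-performance computing and its applications. This work designs and implements an efficient, scalable LBM computational fluid dynamics software package, SWLBM, on the Sunway TaihuLight ("神威·太湖之光") supercomputing system. Targeting the architecture of the domestic many-core processor SW26010, the paper designs several multi-level parallel techniques that improve the simulation speed and scalability of SWLBM, including data reuse for the 19-point stencil, vectorization of the collision step, and asynchronous master-core/slave-core parallelism that hides communication behind computation. With these optimizations, the paper tests numerical simulations of up to 5.6 trillion grid cells; SWLBM reaches a sustained floating-point performance of 4.7 PFlops, and its simulation speed improves 172-fold. Relative to a million-core wind-field simulation on a 10000×10000×5000 grid, SWLBM achieves a parallel efficiency of up to 87% on the full machine with ten million cores. The test results indicate that SWLBM can provide a practical large-scale parallel simulation solution for industrial applications.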
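To make the "19-point stencil" and "collision step" in the abstract concrete, the following is a minimal single-cell sketch in plain Python. It assumes the 19-point stencil is the standard D3Q19 velocity set and that the collision uses the single-relaxation-time BGK operator; the abstract confirms neither, and every name here (`VELOCITIES`, `WEIGHTS`, `moments`, `equilibrium`, `bgk_collide`) is hypothetical rather than taken from SWLBM. It does not attempt to show the SW26010 vectorization or data-reuse optimizations.

```python
from itertools import product

# Assumed D3Q19 velocity set: 1 rest vector, 6 axis neighbours and
# 12 face-diagonal neighbours (the 8 cube corners are excluded).
VELOCITIES = [c for c in product((-1, 0, 1), repeat=3)
              if sum(abs(x) for x in c) <= 2]

# Standard D3Q19 lattice weights, chosen by the number of non-zero components.
WEIGHTS = [{0: 1 / 3, 1: 1 / 18, 2: 1 / 36}[sum(abs(x) for x in c)]
           for c in VELOCITIES]


def moments(f):
    """Return density and velocity of one cell from its 19 distributions."""
    rho = sum(f)
    u = tuple(sum(fi * c[d] for fi, c in zip(f, VELOCITIES)) / rho
              for d in range(3))
    return rho, u


def equilibrium(rho, u):
    """Second-order equilibrium distribution (lattice sound speed^2 = 1/3)."""
    usq = sum(x * x for x in u)
    feq = []
    for w, c in zip(WEIGHTS, VELOCITIES):
        cu = sum(ci * ui for ci, ui in zip(c, u))
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def bgk_collide(f, tau):
    """Relax the distributions of one cell towards equilibrium (assumed BGK)."""
    rho, u = moments(f)
    feq = equilibrium(rho, u)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]
```

Because the collision touches only the 19 values of one cell, the same arithmetic can in principle be applied to many cells at once, which is what makes this step a natural candidate for vectorization; the streaming step that moves values to the 19 neighbours is where stencil data reuse matters.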