Computer Science (计算机科学), 2020, Vol. 47, Issue 11A: 624-627. doi: 10.11896/jsjkx.191100154
XU Xin-peng (许新鹏)¹, HU Bin-xing (胡斌星)²
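Abstract: To meet the solver requirements of rapid structural-fault check calculations for reusable flight vehicles, a GPU (Graphics Processing Unit) is used as a coprocessor, and its high degree of parallelism and high memory bandwidth are exploited to accelerate the solution of sparse linear systems. Because solving the linear system is the most time-consuming step, the incomplete Cholesky conjugate gradient method (ICCG) is used to compute a wing test case, reaching a maximum speedup of about 25x on a GTX1060 graphics card relative to an E3 1230V5 CPU. The results show that the CUDA-based ICCG algorithm can handle the relevant computations for flight-vehicle finite element models whose matrices are of order at least 60 000.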
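The abstract names ICCG but gives no implementation details, so the following is only a minimal CPU-side sketch of the general technique, not the authors' CUDA code. It assumes the zero-fill-in variant IC(0) as the preconditioner, stores matrices as dense Python lists for readability (a real solver would use a sparse format such as CSR), and uses a small 2D Laplacian as a stand-in test matrix rather than a wing model. All function names (`incomplete_cholesky`, `solve_llt`, `iccg`, `laplacian_2d`) are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def incomplete_cholesky(A):
    """IC(0): Cholesky factor restricted to the nonzero pattern of A's lower triangle.

    Assumption: A is symmetric positive definite and the factorization does not
    break down (guaranteed for M-matrices such as the Laplacian used below, but
    math.sqrt can fail for a general SPD matrix).
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            if A[i][j] == 0.0:
                continue  # discard fill-in: this is what makes it "incomplete"
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def solve_llt(L, r):
    """Apply the preconditioner: solve (L L^T) z = r by two triangular solves."""
    n = len(r)
    y = [0.0] * n
    for i in range(n):  # forward substitution, L y = r
        y[i] = (r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):  # backward substitution, L^T z = y
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z


def iccg(A, b, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient with an IC(0) preconditioner.

    Returns (x, iterations). Stops when ||r|| <= tol * ||b||.
    """
    n = len(b)
    x = [0.0] * n
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0
    L = incomplete_cholesky(A)
    r = list(b)  # residual for the initial guess x = 0
    z = solve_llt(L, r)
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = solve_llt(L, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


def laplacian_2d(m):
    """5-point Laplacian on an m x m grid (SPD stand-in for a stiffness matrix)."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for row in range(m):
        for col in range(m):
            i = row * m + col
            A[i][i] = 4.0
            if col > 0:
                A[i][i - 1] = -1.0
            if col < m - 1:
                A[i][i + 1] = -1.0
            if row > 0:
                A[i][i - m] = -1.0
            if row < m - 1:
                A[i][i + m] = -1.0
    return A


if __name__ == "__main__":
    A = laplacian_2d(4)
    b = matvec(A, [1.0] * 16)
    x, iterations = iccg(A, b)
    print(iterations, max(abs(xi - 1.0) for xi in x))
```

In this sketch the matrix-vector product, the dot products, and the vector updates are the parts that map naturally onto GPU parallelism. The two triangular solves in `solve_llt` are inherently more sequential and are usually the harder part to accelerate.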