Mixed Precision Finite Element Algorithm on Heterogeneous Architecture and Its CUDA Implementation

Computer Science (计算机科学), 2012, Vol. 39, Issue 6: 293-296.

Liu Jianhua (刘建华), Wang Chaowei (王朝尉), Ren Jiangyong (任江勇), Tian Rong (田荣)

(High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)

Published: 2018-11-16 Online: 2018-11-16

Mixed Precision Finite Element Algorithm on Heterogeneous Architecture

Abstract

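Abstract (translated from the Chinese): For a long time, single precision seemed to have no place in scientific computing. From an architectural standpoint, however, mixed-precision computing can fully exploit the single-precision performance of vector units and GPGPU devices and deliver higher efficiency, for example by lowering communication bandwidth requirements and improving data transfer and communication efficiency. A mixed-precision explicit finite element algorithm is combined with msFEM, a multiscale finite element program for strongly nonlinear materials, to achieve effective acceleration on GPGPUs. Experimental results show that the mixed-precision explicit finite element program completes more than 90% of its computation in single precision, while its results agree with those obtained entirely in double precision. The algorithm makes scientific computing possible on accelerator cards that do not support the double-precision format. On GPUs that support the double-precision floating-point format, the mixed-precision algorithm is 1.6 to 1.7 times faster than computing entirely in double precision.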

Keywords: GPGPU, mixed-precision algorithm, finite element, parallel computing

Abstract: For a long time, single precision has been giving way to double precision in scientific computing. From the standpoint of computer architecture, however, mixed-precision computing can take full advantage of the single-precision capabilities of vector units and GPGPUs, offering benefits such as lower communication bandwidth requirements and more efficient data movement. A mixed-precision explicit finite element algorithm was proposed and implemented on an nVidia GPU for strongly nonlinear multiscale material simulation. The developed mixed-precision finite element method gives the same results as the fully double-precision calculation, while performing 90% of the finite element calculations in single-precision floating point. As a result, on devices that do not support a native double-precision floating-point format, the mixed-precision algorithm makes double-precision finite element simulation possible, while on devices that do support native double precision, it is 1.6-1.7 times faster than the fully double-precision calculation.

Key words: GPGPU, Mixed precision algorithm, Finite element method, Parallel computing
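
The abstract does not say how msFEM divides its work between the two precisions. As a rough illustration only, the sketch below assumes one common split for explicit schemes: the per-element force evaluation runs in single precision, while nodal accumulation and the time update stay in double precision. It is plain host-side C++ rather than a CUDA kernel, and the 1D spring chain, the cubic hardening law, and the names `internal_force`, `simulate`, and `max_abs_diff` are all hypothetical, not taken from the paper.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical model: a chain of unit masses joined by nonlinear springs,
// with node 0 fixed. This is NOT the msFEM formulation.

// Element loop. Real = float gives the assumed mixed-precision variant,
// Real = double gives the full double-precision reference.
template <typename Real>
void internal_force(const std::vector<double>& u, double k,
                    std::vector<double>& f) {
    assert(f.size() == u.size());
    for (auto& fi : f) fi = 0.0;
    const Real kR = static_cast<Real>(k);
    for (std::size_t e = 0; e + 1 < u.size(); ++e) {
        // Form the relative displacement in double, then narrow it, so the
        // subtraction of two nodal values is not carried out in float.
        const Real du = static_cast<Real>(u[e + 1] - u[e]);
        // Element-level "constitutive" work in precision Real
        // (assumed cubic hardening law).
        const Real fe = kR * du * (Real(1) + du * du);
        // Nodal accumulation is always done in double.
        f[e]     += static_cast<double>(fe);
        f[e + 1] -= static_cast<double>(fe);
    }
}

// Explicit (semi-implicit Euler) time stepping; state is kept in double.
template <typename Real>
std::vector<double> simulate(std::size_t n, int steps, double dt) {
    assert(n >= 2);
    const double k = 100.0;  // assumed spring stiffness
    std::vector<double> u(n, 0.0), v(n, 0.0), f(n, 0.0);
    v[n - 1] = 1.0;  // initial velocity at the free end
    for (int s = 0; s < steps; ++s) {
        internal_force<Real>(u, k, f);
        for (std::size_t i = 1; i < n; ++i) {  // node 0 stays fixed
            v[i] += dt * f[i];  // unit mass
            u[i] += dt * v[i];
        }
    }
    return u;
}

// Largest absolute difference between two displacement vectors.
double max_abs_diff(const std::vector<double>& a,
                    const std::vector<double>& b) {
    assert(a.size() == b.size());
    double m = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        m = std::fmax(m, std::fabs(a[i] - b[i]));
    return m;
}
```

Calling `simulate<float>(16, 1000, 0.01)` and `simulate<double>(16, 1000, 0.01)` and passing both results to `max_abs_diff` gives a quick check of how closely the single-precision element loop tracks the double-precision reference for this toy problem. On a GPU the element loop would typically become the kernel body, but that mapping is not shown here.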

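Liu Jianhua, Wang Chaowei, Ren Jiangyong, Tian Rong. Mixed Precision Finite Element Algorithm on Heterogeneous Architecture and Its CUDA Implementation[J]. Computer Science (计算机科学), 2012, 39(6): 293-296. https://doi.org/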
