Computer Science ›› 2012, Vol. 39 ›› Issue (6): 293-296.

• Architecture •

Mixed-Precision Finite Element Algorithm for Heterogeneous Architectures and Its CUDA Implementation

刘建华,王朝尉,任江勇,田荣   

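  1. (High Performance Computer Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190)
  • Publication date: 2018-11-16 Release date: 2018-11-16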

Mixed Precision Finite Element Algorithm on Heterogeneous Architecture

  • Online:2018-11-16 Published:2018-11-16

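Abstract (translated from the Chinese): For a long time, single precision seemed to have no place in scientific computing. From an architectural point of view, however, mixed-precision computing can fully exploit the single-precision performance of vector units and GPGPU devices and deliver higher efficiency, for example by lowering communication bandwidth requirements and improving data transfer and communication efficiency. A mixed-precision explicit finite element algorithm, combined with msFEM, a multiscale finite element program for strongly nonlinear materials, achieves effective acceleration on GPGPUs. Experimental results show that the mixed-precision explicit finite element program completes more than 90% of its computation in single precision, and that its results agree with those obtained entirely in double precision. The algorithm makes scientific computing possible on accelerator cards that do not support the double-precision format. On GPUs that do support the double-precision floating-point format, the mixed-precision algorithm improves the speedup by a factor of 1.6 to 1.7 compared with computing entirely in double precision.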

Keywords (translated from the Chinese): GPGPU, mixed-precision algorithm, finite element, parallel computing

Abstract: For a long time, single precision has been giving way to double precision in scientific computing. From an architectural standpoint, however, mixed-precision computing can take full advantage of the excellent computing capabilities of vector units and GPGPUs, with benefits such as lower communication bandwidth requirements and more efficient data movement. A mixed-precision explicit finite element algorithm was proposed and implemented on NVIDIA GPUs for strongly nonlinear multiscale material simulation. The developed mixed-precision finite element method gives the same results as the fully double-precision calculation while performing 90% of the finite element calculations in single precision. As a result, on devices that do not support a native double-precision floating-point format, the mixed-precision algorithm makes double-precision finite element simulation possible, while on devices that do support native double precision, the mixed-precision algorithm is 1.6 to 1.7 times faster than the fully double-precision calculation.

Key words: GPGPU, Mixed precision algorithm, Finite element method, Parallel computing
