Computer Science ›› 2012, Vol. 39 ›› Issue (3): 286-289.

• Architecture •

Parallelizing the MRRR Algorithm with CUDA

Wang Lijie (汪丽杰), Zhao Yonghua (赵永华)

  1. (Supercomputing Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190)
  • Publication date: 2018-11-16 Release date: 2018-11-16

Parallel Realization of the MRRR Algorithm Based on CUDA

  • Online:2018-11-16 Published:2018-11-16

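Abstract: MRRR (Multiple Relatively Robust Representations) is one of the most efficient and accurate algorithms for the symmetric tridiagonal eigenvalue problem. Based on an analysis of the MRRR algorithm and of the CUDA (Compute Unified Device Architecture) parallel architecture, and targeting the parts of the algorithm that can run in parallel, we implemented a CUDA-based parallel MRRR using the single-instruction, multiple-thread (SIMT) model and optimized it with respect to memory structure. Experiments show that, compared with the serial MRRR implementation in LAPACK, the parallel method achieves a 20-fold speedup while preserving accuracy. In terms of both accuracy and computation time, this indicates that MRRR is well suited to parallel execution on GPUs.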

Keywords: MRRR, parallel, CUDA, eigenproblem

Abstract: The multiple relatively robust representations (MRRR) algorithm is one of the fastest and most accurate algorithms for the symmetric tridiagonal eigenproblem. After analyzing the MRRR algorithm and the CUDA parallel architecture, we present a parallel MRRR algorithm based on CUDA and explore optimizations of its memory structure. Compared with LAPACK's MRRR implementation, this parallel method provides a 20-fold speedup. This result demonstrates that the algorithm can be mapped efficiently onto GPUs.

Key words: MRRR, Parallel, CUDA, Eigenproblem
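
The abstract does not say which steps of MRRR the authors map onto GPU threads. As an illustration only, one ingredient of MRRR that is often regarded as naturally data-parallel is eigenvalue refinement by bisection with Sturm counts: each eigenvalue can be bracketed independently of the others. The sketch below is sequential Python, not the paper's CUDA code. The names `sturm_count`, `kth_eigenvalue`, and `PIVMIN` are hypothetical, and the assumption is that each call to `kth_eigenvalue` would correspond to the work of one thread in a SIMT implementation.

```python
import math

# Assumed safeguard (hypothetical value): replaces a near-zero pivot so the
# recurrence never divides by zero.
PIVMIN = 1e-300


def sturm_count(diag, off, sigma):
    """Count eigenvalues below sigma for a symmetric tridiagonal matrix.

    diag holds the n diagonal entries, off the n-1 off-diagonal entries.
    The count equals the number of negative pivots in the LDL^T
    factorization of T - sigma*I.
    """
    count = 0
    d = 1.0
    for i in range(len(diag)):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        d = (diag[i] - sigma) - b2 / d
        if abs(d) < PIVMIN:
            d = -PIVMIN
        if d < 0.0:
            count += 1
    return count


def kth_eigenvalue(diag, off, k, tol=1e-12, max_iter=200):
    """Bisect for the k-th smallest eigenvalue (k is 0-based).

    Independent of every other k, so in a SIMT setting one thread could
    run this loop per eigenvalue (an assumption, not the paper's design).
    """
    n = len(diag)
    # Gershgorin discs give an interval that contains every eigenvalue.
    lo, hi = math.inf, -math.inf
    for i in range(n):
        r = (abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
        lo = min(lo, diag[i] - r)
        hi = max(hi, diag[i] + r)
    margin = 1e-12 * max(1.0, hi - lo)
    lo, hi = lo - margin, hi + margin
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if sturm_count(diag, off, mid) > k:
            hi = mid  # at least k+1 eigenvalues lie below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def eigenvalues(diag, off):
    """All eigenvalues in ascending order; the loop body is the per-thread work."""
    return [kth_eigenvalue(diag, off, k) for k in range(len(diag))]
```

For the n-by-n matrix with 2 on the diagonal and -1 on the off-diagonals, the eigenvalues are known in closed form, 2 - 2cos(kπ/(n+1)) for k = 1..n, which gives a convenient check on the sketch.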
