计算机科学 ›› 2021, Vol. 48 ›› Issue (6): 34-40.doi: 10.11896/jsjkx.201100051
何亚茹1, 庞建民1,2, 徐金龙2, 朱雨2, 陶小涵2
HE Ya-ru1, PANG Jian-min1,2, XU Jin-long2, ZHU Yu2, TAO Xiao-han2
摘要: 求解全源最短路径的Floyd算法是许多实际应用基础上的关键构建块,由于其时间复杂度较高,串行Floyd算法不适用于大规模输入图计算,针对不同平台的并行Floyd算法设计可为解决现实问题提供有效帮助。针对Floyd算法与国产自主研发处理器匹配滞后的问题,首次提出基于神威平台的Floyd并行算法的实现和优化。根据SW26010处理器主-从核架构的特点,采用主从加速编程模型进行并行实现,并分析了影响该算法性能的关键因素,通过算法优化、数组划分和双缓冲技术进行优化,逐步提升算法性能。测试结果表明,与主核上串行算法相比,基于神威平台的Floyd并行算法在单个SW26010处理器上可以获得106倍的最高加速。
中图分类号:
[1] | DOR D,HALPERIN S,ZWICK U.All pairs almost shortest paths[C]//SIAM Journal on Computing.1996,29(5):452-461. |
[2] | CHAN T M.More algorithms for all-pairs shortest paths inweighted graphs[J].Proceedings of the Annual ACM Sympo-sium on Theory of Computing,2010,39:2075-2089. |
[3] | LI J W,ZHANG J,ZHAO J C,et al.A Load Balancing Shortest Path Routing Algorithm for SRIO Network[J].Computer Engineering,2020,46(3):214-221,228. |
[4] | DONGARRA J.Report on the sunway taihulight system[D].Knoxville:University of Tennessee,2016. |
[5] | LIU X,GUO H,SUN R J,et al.The Characteristic Analysis and Exascale Scalability Research of Large Scale Parallel Applications on Sunway TaihuLight Supercomputer[J].Chinese Journal of Computers,2018,41(10):2209-2220. |
[6] | LI M,YANG C,SUN Q,et al.Enabling highly efficient k-Means computations on the SW26010 many-core processor of sunway taihulight[J].Journal of Computer Science & Technology,2019,34(1):77-93. |
[7] | NI H,LIU X.Multi-Core optimization technology of unstructured grid based on Sunway TaihuLight[J].Computer Engineering,2019,45(6):45-51. |
[8] | XU Z,LIN J,MATSUOKA S.Benchmarking SW26010 many-core processor[C]//Parallel & Distributed Processing Sympo-sium Workshops.IEEE,2017. |
[9] | FLOYD R W.Algorithm 97,Shortest path algorithms[J].Communications of the ACM,1962,5(6):345. |
[10] | VENKATARAMAN G,SAHNI S,MUKHOPADHYAYA S.A blocked all-pairs shortest-paths algorithm[C]//Scandinavian Workshop on Algorithm Theory.Berlin,Heidelberg:Springer,2000. |
[11] | ZHANG D Q,WU G L,LIU D F.Accelerated and OptimizedMethod of Floyd Algorithm to Find out Shortest Path[J].Computer Engineering and Applications,2009(17):45-47,50. |
[12] | ZUO X F,SHEN W J.Improved Algorithm about Multi-shortest Path Problem Based on Floyd Algorithm[J].Computer Science,2017,44(5):238-240,273. |
[13] | LU L G,LIU L Y,LU T D,et al.A Modified Floyd Algorithm[J].Journal of East China University of Technology,2019,42(1):81-84. |
[14] | SRINIVASAN T,BALAKRISHNAN R,GANGADHARAN S A,et al.A scalable parallelization of all-pairs shortest path algorithm for a high performance cluster environment[C]//International Conference on Parallel & Distributed Systems.IEEE,2007. |
[15] | TESKEREDZIC E,KARAHODZIC K,NOSOVIC N.Comparison of the non-blocked and blocked floyd-warshall algorithm with regard to speedup and energy saving on an embedded GPU[C]//19th International Symposium INFOTEH-JAHORINA.2020:18-20. |
[16] | BONDHUGULA U,DEVULAPALLI A,FERNANDO J,et al.Parallel FPGA-based all-pairs shortest-paths in a directed graph[C]//20th International Parallel and Distributed Processing Symposium(IPDPS 2006).IEEE,2006. |
[17] | XING X X,ZHAO G X,LUO Z Y,et al.GPU-based Algorithm of Shortest Path[J].Computer Science,2012,39(3):299-303. |
[18] | FOSTER I.Designing and building parallel programs:concepts and tools for parallel software engineering[J].Tetrahedron Letters,1995,11(3):296-300. |
[1] | 傅天豪, 田鸿运, 金煜阳, 杨章, 翟季冬, 武林平, 徐小文. 一种面向构件化并行应用程序的性能骨架分析方法[J]. 计算机科学, 2021, 48(6): 1-9. |
[2] | 朱雨, 庞建民, 徐金龙, 陶小涵, 王军. 面向SW26010处理器的三维Stencil自适应分块参数算法[J]. 计算机科学, 2021, 48(6): 10-18. |
[3] | 冯凯, 马鑫玉. (n,k)-冒泡排序网络的子网络可靠性[J]. 计算机科学, 2021, 48(4): 43-48. |
[4] | 马梦宇, 吴烨, 陈荦, 伍江江, 李军, 景宁. 显示导向型的大规模地理矢量实时可视化技术[J]. 计算机科学, 2020, 47(9): 117-122. |
[5] | 陈国良, 张玉杰. 并行计算学科发展历程[J]. 计算机科学, 2020, 47(8): 1-4. |
[6] | 阳王东, 王昊天, 张宇峰, 林圣乐, 蔡沁耘. 异构混合并行计算综述[J]. 计算机科学, 2020, 47(8): 5-16. |
[7] | 袁欣辉, 林蓉芬, 魏迪, 尹万旺, 徐金秀. 面向国产异构众核处理器SW26010的BFS优化方法[J]. 计算机科学, 2020, 47(8): 98-104. |
[8] | 冯凯, 李婧. k元n方体的子网络可靠性研究[J]. 计算机科学, 2020, 47(7): 31-36. |
[9] | 吕小敬, 刘钊, 褚学森, 石树鹏, 孟虹松, 黄震春. 面向超大规模并行模拟的LBM计算流体力学软件[J]. 计算机科学, 2020, 47(4): 13-17. |
[10] | 杨宗霖, 李天瑞, 刘胜久, 殷成凤, 贾真, 珠杰. 基于Spark Streaming的流式并行文本校对[J]. 计算机科学, 2020, 47(4): 36-41. |
[11] | 邓定胜. 一种改进的DBSCAN算法在Spark平台上的应用[J]. 计算机科学, 2020, 47(11A): 425-429. |
[12] | 徐传福,王曦,刘舒,陈世钊,林玉. 基于Python的大规模高性能LBM多相流模拟[J]. 计算机科学, 2020, 47(1): 17-23. |
[13] | 徐磊, 陈荣亮, 蔡小川. 基于非结构化网格的高可扩展并行有限体积格子[J]. 计算机科学, 2019, 46(8): 84-88. |
[14] | 倪鸿, 刘鑫. 非结构网格下稀疏下三角方程求解器众核优化技术研究[J]. 计算机科学, 2019, 46(6A): 518-522. |
[15] | 陶小涵, 庞建民, 高伟, 王琦, 姚金阳. 基于SW26010处理器的FT程序的性能优化[J]. 计算机科学, 2019, 46(4): 321-328. |
|