计算机科学 ›› 2021, Vol. 48 ›› Issue (6): 34-40.doi: 10.11896/jsjkx.201100051
何亚茹1, 庞建民1,2, 徐金龙2, 朱雨2, 陶小涵2
HE Ya-ru1, PANG Jian-min1,2, XU Jin-long2, ZHU Yu2, TAO Xiao-han2
摘要: 求解全源最短路径的Floyd算法是许多实际应用基础上的关键构建块,由于其时间复杂度较高,串行Floyd算法不适用于大规模输入图计算,针对不同平台的并行Floyd算法设计可为解决现实问题提供有效帮助。针对Floyd算法与国产自主研发处理器匹配滞后的问题,首次提出基于神威平台的Floyd并行算法的实现和优化。根据SW26010处理器主-从核架构的特点,采用主从加速编程模型进行并行实现,并分析了影响该算法性能的关键因素,通过算法优化、数组划分和双缓冲技术进行优化,逐步提升算法性能。测试结果表明,与主核上串行算法相比,基于神威平台的Floyd并行算法在单个SW26010处理器上可以获得106倍的最高加速。
中图分类号:
[1]DOR D,HALPERIN S,ZWICK U.All pairs almost shortest paths[C]//SIAM Journal on Computing.1996,29(5):452-461. [2]CHAN T M.More algorithms for all-pairs shortest paths inweighted graphs[J].Proceedings of the Annual ACM Sympo-sium on Theory of Computing,2010,39:2075-2089. [3]LI J W,ZHANG J,ZHAO J C,et al.A Load Balancing Shortest Path Routing Algorithm for SRIO Network[J].Computer Engineering,2020,46(3):214-221,228. [4]DONGARRA J.Report on the sunway taihulight system[D].Knoxville:University of Tennessee,2016. [5]LIU X,GUO H,SUN R J,et al.The Characteristic Analysis and Exascale Scalability Research of Large Scale Parallel Applications on Sunway TaihuLight Supercomputer[J].Chinese Journal of Computers,2018,41(10):2209-2220. [6]LI M,YANG C,SUN Q,et al.Enabling highly efficient k-Means computations on the SW26010 many-core processor of sunway taihulight[J].Journal of Computer Science & Technology,2019,34(1):77-93. [7]NI H,LIU X.Multi-Core optimization technology of unstructured grid based on Sunway TaihuLight[J].Computer Engineering,2019,45(6):45-51. [8]XU Z,LIN J,MATSUOKA S.Benchmarking SW26010 many-core processor[C]//Parallel & Distributed Processing Sympo-sium Workshops.IEEE,2017. [9]FLOYD R W.Algorithm 97,Shortest path algorithms[J].Communications of the ACM,1962,5(6):345. [10]VENKATARAMAN G,SAHNI S,MUKHOPADHYAYA S.A blocked all-pairs shortest-paths algorithm[C]//Scandinavian Workshop on Algorithm Theory.Berlin,Heidelberg:Springer,2000. [11]ZHANG D Q,WU G L,LIU D F.Accelerated and OptimizedMethod of Floyd Algorithm to Find out Shortest Path[J].Computer Engineering and Applications,2009(17):45-47,50. [12]ZUO X F,SHEN W J.Improved Algorithm about Multi-shortest Path Problem Based on Floyd Algorithm[J].Computer Science,2017,44(5):238-240,273. [13]LU L G,LIU L Y,LU T D,et al.A Modified Floyd Algorithm[J].Journal of East China University of Technology,2019,42(1):81-84. [14]SRINIVASAN T,BALAKRISHNAN R,GANGADHARAN S A,et al.A scalable parallelization of all-pairs shortest path algorithm for a high performance cluster environment[C]//International Conference on Parallel & Distributed Systems.IEEE,2007. [15]TESKEREDZIC E,KARAHODZIC K,NOSOVIC N.Comparison of the non-blocked and blocked floyd-warshall algorithm with regard to speedup and energy saving on an embedded GPU[C]//19th International Symposium INFOTEH-JAHORINA.2020:18-20. [16]BONDHUGULA U,DEVULAPALLI A,FERNANDO J,et al.Parallel FPGA-based all-pairs shortest-paths in a directed graph[C]//20th International Parallel and Distributed Processing Symposium(IPDPS 2006).IEEE,2006. [17]XING X X,ZHAO G X,LUO Z Y,et al.GPU-based Algorithm of Shortest Path[J].Computer Science,2012,39(3):299-303. [18]FOSTER I.Designing and building parallel programs:concepts and tools for parallel software engineering[J].Tetrahedron Letters,1995,11(3):296-300. |
[1] | 陈鑫, 李芳, 丁海昕, 孙唯哲, 刘鑫, 陈德训, 叶跃进, 何香. 面向国产异构众核架构的CFD非结构网格计算并行优化方法 Parallel Optimization Method of Unstructured-grid Computing in CFD for DomesticHeterogeneous Many-core Architecture 计算机科学, 2022, 49(6): 99-107. https://doi.org/10.11896/jsjkx.210400157 |
[2] | 傅天豪, 田鸿运, 金煜阳, 杨章, 翟季冬, 武林平, 徐小文. 一种面向构件化并行应用程序的性能骨架分析方法 Performance Skeleton Analysis Method Towards Component-based Parallel Applications 计算机科学, 2021, 48(6): 1-9. https://doi.org/10.11896/jsjkx.201200115 |
[3] | 朱雨, 庞建民, 徐金龙, 陶小涵, 王军. 面向SW26010处理器的三维Stencil自适应分块参数算法 Adaptive Tiling Size Algorithm for 3D Stencil Computation on SW26010 Many-core Processor 计算机科学, 2021, 48(6): 10-18. https://doi.org/10.11896/jsjkx.200700059 |
[4] | 冯凯, 马鑫玉. (n,k)-冒泡排序网络的子网络可靠性 Subnetwork Reliability of (n,k)-bubble-sort Networks 计算机科学, 2021, 48(4): 43-48. https://doi.org/10.11896/jsjkx.201100139 |
[5] | 胡蓉, 阳王东, 王昊天, 罗辉章, 李肯立. 基于GPU加速的并行WMD算法 Parallel WMD Algorithm Based on GPU Acceleration 计算机科学, 2021, 48(12): 24-28. https://doi.org/10.11896/jsjkx.210600213 |
[6] | 马梦宇, 吴烨, 陈荦, 伍江江, 李军, 景宁. 显示导向型的大规模地理矢量实时可视化技术 Display-oriented Data Visualization Technique for Large-scale Geographic Vector Data 计算机科学, 2020, 47(9): 117-122. https://doi.org/10.11896/jsjkx.190800121 |
[7] | 陈国良, 张玉杰. 并行计算学科发展历程 Development of Parallel Computing Subject 计算机科学, 2020, 47(8): 1-4. https://doi.org/10.11896/jsjkx.200600027 |
[8] | 阳王东, 王昊天, 张宇峰, 林圣乐, 蔡沁耘. 异构混合并行计算综述 Survey of Heterogeneous Hybrid Parallel Computing 计算机科学, 2020, 47(8): 5-16. https://doi.org/10.11896/jsjkx.200600045 |
[9] | 袁欣辉, 林蓉芬, 魏迪, 尹万旺, 徐金秀. 面向国产异构众核处理器SW26010的BFS优化方法 Optimization of BFS on Domestic Heterogeneous Many-core Processor SW26010 计算机科学, 2020, 47(8): 98-104. https://doi.org/10.11896/jsjkx.191000013 |
[10] | 冯凯, 李婧. k元n方体的子网络可靠性研究 Study on Subnetwork Reliability of k-ary n-cubes 计算机科学, 2020, 47(7): 31-36. https://doi.org/10.11896/jsjkx.190700170 |
[11] | 吕小敬, 刘钊, 褚学森, 石树鹏, 孟虹松, 黄震春. 面向超大规模并行模拟的LBM计算流体力学软件 Extreme-scale Simulation Based LBM Computing Fluid Dynamics Simulations 计算机科学, 2020, 47(4): 13-17. https://doi.org/10.11896/jsjkx.191000010 |
[12] | 杨宗霖, 李天瑞, 刘胜久, 殷成凤, 贾真, 珠杰. 基于Spark Streaming的流式并行文本校对 Streaming Parallel Text Proofreading Based on Spark Streaming 计算机科学, 2020, 47(4): 36-41. https://doi.org/10.11896/jsjkx.190300070 |
[13] | 邓定胜. 一种改进的DBSCAN算法在Spark平台上的应用 Application of Improved DBSCAN Algorithm on Spark Platform 计算机科学, 2020, 47(11A): 425-429. https://doi.org/10.11896/jsjkx.190700071 |
[14] | 徐传福,王曦,刘舒,陈世钊,林玉. 基于Python的大规模高性能LBM多相流模拟 Large-scale High-performance Lattice Boltzmann Multi-phase Flow Simulations Based on Python 计算机科学, 2020, 47(1): 17-23. https://doi.org/10.11896/jsjkx.190500009 |
[15] | 徐磊, 陈荣亮, 蔡小川. 基于非结构化网格的高可扩展并行有限体积格子 Scalable Parallel Finite Volume Lattice Boltzmann Method Based on Unstructured Grid 计算机科学, 2019, 46(8): 84-88. https://doi.org/10.11896/j.issn.1002-137X.2019.08.013 |
|