计算机科学 ›› 2019, Vol. 46 ›› Issue (1): 314-319.doi: 10.11896/j.issn.1002-137X.2019.01.049
徐启泽, 韩文廷, 陈俊仕, 安虹
XU Qi-ze, HAN Wen-ting, CHEN Jun-shi, AN Hong
摘要: 图算法在多个领域具有重要的应用价值。随着社会信息化程度的提高,需要处理的图数据量越来越大,图算法的性能已成为研究热点。广度优先搜索算法是一种重要的图算法,研究它的性能优化技术可以为其他图算法的性能优化提供借鉴。目前,在新一代Xeon Phi众核处理器上的工作均基于自顶向下算法且没有考虑到非均匀访存(NUMA)对性能的影响。文中以混合广度优先搜索算法为基础,结合NUMA拓扑结构,从任务分配、向量化和数据预处理3个方面展开优化,在Xeon Phi平台上设计并实现了高性能并行广度优先搜索算法。一系列实验结果表明,优化后的算法在不同规模的测试数据上与Graph500官方优化的算法相比取得了50%~145%的性能提升。
中图分类号:
[1]LIU H,HUANG H H.Enterprise:breadth-first graph traversal on GPUs[C]//SC15:International Conference for High Performance Computing,Networking,Storage and Analysis.2015:1-12.<br /> [2]BEAMER S,BULUC A,ASANOVIC K,et al.Distributed Memory Breadth-First Search Revisited:Enabling Bottom-Up Search[C]//2013 IEEE International Symposium on Parallel & Distributed Processing,Workshops and Phd Forum.2013:1618-1627.<br /> [3]LIN H,TANG X C,YU B W,et al.Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores[C]//International Parallel and Distributed Processing Symposium (IPDPS).2017:635-645.<br /> [4]PAREDES M,RILEY G,LUJAN M.Breadth first search vectorization on the Intel Xeon Phi[C]//Proceedings of the ACM International Conference on Computing Frontiers (CF’16).2016:1-10.<br /> [5]GAO T,LU Y T,ZHANG B D,et al.Using the Intel Many Integrated Core to accelerate graph traversal[J].International Journal of High Performance Computing Applications,2014,28(3):255-266<br /> [6]BESTA M,MARENDING F,SOLOMONIK E,et al.SlimSell:A Vectorizable Graph Representation for Breadth-First Search[C]//2017 IEEE International Parallel and Distributed Proces-sing Symposium (IPDPS).2017:32-41.<br /> [7]AGARWAL V,PETRINI F,PASETTO D,et al.Scalable Graph Exploration on Multicore Processors[C]//In Proceedings of the 2010 ACM/IEEE International Conference for High Perfor-mance Computing,Networking,Storage and Analysis (SC’10).2010:1-11.<br /> [8]YASUI Y,FUJISAWA K.Fast and scalable NUMA-based thread parallel breadth-first search[C]//2015 International Conference on High Performance Computing & Simulation (HPCS).2015:377-385.<br /> [9]Graph500[OL].https://graph500.org.<br /> [10]LESKOVEC J,CHAKRABARTI D,KLEINBERG J,et al.Kronecker Graphs:An Approach to Modeling Networks[J].The Journal of Machine Learning Research,2010,11(3):985-1042.<br /> [11]Sparse Matrix[OL].https://en.wikipedia.org/wiki/Sparse_ma-trix.<br /> [12]SODANI A.Knights landing (KNL):2nd Generation Intel<sup>®</sup> Xeon Phi processor[C]//2015 IEEE Hot Chips 27 Symposium (HCS).2015:1-24.<br /> [13]HENG D D,TANG Y H,YI X D,et al.Implementation and performance analysis of BFS algorithm on a parallel prototype system[J].Computer Engineering & Science,2017,39(1):27-34.(in Chinese)<br /> 衡冬冬,唐玉华,易晓东,等.并行原型系统上BFS算法设计实现与测试分析[J].计算机工程与科学,2017,39(1):27-34. |
[1] | 陈钧吾, 余华山. 面向无尺度图的Δ-stepping算法改进策略 Strategies for Improving Δ-stepping Algorithm on Scale-free Graphs 计算机科学, 2022, 49(6A): 594-600. https://doi.org/10.11896/jsjkx.210400062 |
[2] | 叶跃进, 李芳, 陈德训, 郭恒, 陈鑫. 基于国产众核架构的非结构网格分区块重构预处理算法研究 Study on Preprocessing Algorithm for Partition Reconnection of Unstructured-grid Based on Domestic Many-core Architecture 计算机科学, 2022, 49(6): 73-80. https://doi.org/10.11896/jsjkx.210900045 |
[3] | 鄂海红, 张田宇, 宋美娜. 基于Web的数据可视化图表渲染优化方法 Web-based Data Visualization Chart Rendering Optimization Method 计算机科学, 2021, 48(3): 119-123. https://doi.org/10.11896/jsjkx.200600038 |
[4] | 张晓, 张思蒙, 石佳, 董聪, 李战怀. Ceph分布式存储系统性能优化技术研究综述 Review on Performance Optimization of Ceph Distributed Storage System 计算机科学, 2021, 48(2): 1-12. https://doi.org/10.11896/jsjkx.201000149 |
[5] | 金琪, 王俊昌, 付雄. 基于智能放置策略的Cuckoo哈希表 Cuckoo Hash Table Based on Smart Placement Strategy 计算机科学, 2020, 47(8): 80-86. https://doi.org/10.11896/jsjkx.191200109 |
[6] | 徐江峰谭玉龙. 基于机器学习的HBase配置参数优化研究 Research on HBase Configuration Parameter Optimization Based on Machine Learning 计算机科学, 2020, 47(6A): 474-479. https://doi.org/10.11896/JsJkx.190900046 |
[7] | 张彭奕, 宋杰. 区块链共识算法效能优化研究进展 Research Advance on Efficiency Optimization of Blockchain Consensus Algorithms 计算机科学, 2020, 47(12): 296-303. https://doi.org/10.11896/jsjkx.200700020 |
[8] | 徐传福,王曦,刘舒,陈世钊,林玉. 基于Python的大规模高性能LBM多相流模拟 Large-scale High-performance Lattice Boltzmann Multi-phase Flow Simulations Based on Python 计算机科学, 2020, 47(1): 17-23. https://doi.org/10.11896/jsjkx.190500009 |
[9] | 李芳,李志辉,徐金秀,范昊,褚学森,李新亮. 基于十亿亿次国产超算系统的流体力学软件众核适应性研究 Research on Adaptation of CFD Software Based on Many-core Architecture of 100P Domestic Supercomputing System 计算机科学, 2020, 47(1): 24-30. https://doi.org/10.11896/jsjkx.181102176 |
[10] | 张凌浩, 桂盛霖, 穆逢君, 王胜. 基于后缀树的二进制可执行代码的克隆检测算法 Clone Detection Algorithm for Binary Executable Code with Suffix Tree 计算机科学, 2019, 46(10): 141-147. https://doi.org/10.11896/jsjkx.180801573 |
[11] | 周蓓, 黄永忠, 许瑾晨, 郭绍忠. 向量数学库的向量化方法研究 Study on SIMD Method of Vector Math Library 计算机科学, 2019, 46(1): 320-324. https://doi.org/10.11896/j.issn.1002-137X.2019.01.050 |
[12] | 姚金阳, 赵荣彩, 王琦, 李颖颖. 面向间接数组索引的向量化方法 Vectorization Methods for Indirect Array Index 计算机科学, 2018, 45(9): 220-223. https://doi.org/10.11896/j.issn.1002-137X.2018.09.036 |
[13] | 孙涛, 张俊星. SDN性能优化技术研究综述 Review of SDN Performance Optimization Technology 计算机科学, 2018, 45(11A): 84-91. |
[14] | 赵澄, 陈君新, 姚明海. 基于SVM分类器的XSS攻击检测技术 XSS Attack Detection Technology Based on SVM Classifier 计算机科学, 2018, 45(11A): 356-360. |
[15] | 吴卫祖,刘利群,谢冬青. 基于神经网络的异构网络向量化表示方法 Vectorized Representation of Heterogeneous Network Based on Neural Networks 计算机科学, 2017, 44(5): 272-275. https://doi.org/10.11896/j.issn.1002-137X.2017.05.049 |
|