Computer Science ›› 2019, Vol. 46 ›› Issue (1): 314-319.doi: 10.11896/j.issn.1002-137X.2019.01.049

• Interdiscipline & Frontier • Previous Articles     Next Articles

Optimization of Breadth-first Search Algorithm Based on Many-core Platform

XU Qi-ze, HAN Wen-ting, CHEN Jun-shi, AN Hong   

  1. (School of Computer Science and Technology,University of Science and Technology of China,Hefei 230027,China)
  • Received:2018-01-18 Online:2019-01-15 Published:2019-02-25

Abstract: Graph algorithms have important application value in many fields.With the development of social informatization,the amount of graph data to be processed has become larger and larger,the performance of graph algorithms has become a research hotspot.Breadth-first search algorithm is an important graph algorithm,studying its performance optimization techniques can provide reference for the performance optimization of other graph algorithms.The current work on new generation Xeon Phi many-core processors are based on top-down algorithm and do not take into account NUMA effects on performance.Based on the hybrid breadth-first search algorithm and the NUMA topology,this paper optimized Breadth-first search algorithm from the perspective of tasks allocation,vectorization and data preprocessing,designed and implemented a high-performance parallel breadth-first search algorithm on the Xeon Phi platform.A series of experimental results show that the optimized algorithm has gained 50%~145% improvement compared with Graph500 official tuned algorithm on different scales of test data.

Key words: Breadth-first search, Many-core architecture, NUMA, Performance optimization, Vectorization

CLC Number: 

  • TP302
[1]LIU H,HUANG H H.Enterprise:breadth-first graph traversal on GPUs[C]//SC15:International Conference for High Performance Computing,Networking,Storage and Analysis.2015:1-12.<br /> [2]BEAMER S,BULUC A,ASANOVIC K,et al.Distributed Memory Breadth-First Search Revisited:Enabling Bottom-Up Search[C]//2013 IEEE International Symposium on Parallel & Distributed Processing,Workshops and Phd Forum.2013:1618-1627.<br /> [3]LIN H,TANG X C,YU B W,et al.Scalable Graph Traversal on Sunway TaihuLight with Ten Million Cores[C]//International Parallel and Distributed Processing Symposium (IPDPS).2017:635-645.<br /> [4]PAREDES M,RILEY G,LUJAN M.Breadth first search vectorization on the Intel Xeon Phi[C]//Proceedings of the ACM International Conference on Computing Frontiers (CF’16).2016:1-10.<br /> [5]GAO T,LU Y T,ZHANG B D,et al.Using the Intel Many Integrated Core to accelerate graph traversal[J].International Journal of High Performance Computing Applications,2014,28(3):255-266<br /> [6]BESTA M,MARENDING F,SOLOMONIK E,et al.SlimSell:A Vectorizable Graph Representation for Breadth-First Search[C]//2017 IEEE International Parallel and Distributed Proces-sing Symposium (IPDPS).2017:32-41.<br /> [7]AGARWAL V,PETRINI F,PASETTO D,et al.Scalable Graph Exploration on Multicore Processors[C]//In Proceedings of the 2010 ACM/IEEE International Conference for High Perfor-mance Computing,Networking,Storage and Analysis (SC’10).2010:1-11.<br /> [8]YASUI Y,FUJISAWA K.Fast and scalable NUMA-based thread parallel breadth-first search[C]//2015 International Conference on High Performance Computing & Simulation (HPCS).2015:377-385.<br /> [9]Graph500[OL].https://graph500.org.<br /> [10]LESKOVEC J,CHAKRABARTI D,KLEINBERG J,et al.Kronecker Graphs:An Approach to Modeling Networks[J].The Journal of Machine Learning Research,2010,11(3):985-1042.<br /> [11]Sparse Matrix[OL].https://en.wikipedia.org/wiki/Sparse_ma-trix.<br /> [12]SODANI A.Knights landing (KNL):2nd Generation Intel<sup>®</sup> Xeon Phi processor[C]//2015 IEEE Hot Chips 27 Symposium (HCS).2015:1-24.<br /> [13]HENG D D,TANG Y H,YI X D,et al.Implementation and performance analysis of BFS algorithm on a parallel prototype system[J].Computer Engineering & Science,2017,39(1):27-34.(in Chinese)<br /> 衡冬冬,唐玉华,易晓东,等.并行原型系统上BFS算法设计实现与测试分析[J].计算机工程与科学,2017,39(1):27-34.
[1] CHEN Jun-wu, YU Hua-shan. Strategies for Improving Δ-stepping Algorithm on Scale-free Graphs [J]. Computer Science, 2022, 49(6A): 594-600.
[2] YE Yue-jin, LI Fang, CHEN De-xun, GUO Heng, CHEN Xin. Study on Preprocessing Algorithm for Partition Reconnection of Unstructured-grid Based on Domestic Many-core Architecture [J]. Computer Science, 2022, 49(6): 73-80.
[3] CHEN Le, GAO Ling, REN Jie, DANG Xin, WANG Yi-hao, CAO Rui, ZHENG Jie, WANG Hai. Adaptive Bitrate Streaming for Energy-Efficiency Mobile Augmented Reality [J]. Computer Science, 2022, 49(1): 194-203.
[4] E Hai-hong, ZHANG Tian-yu, SONG Mei-na. Web-based Data Visualization Chart Rendering Optimization Method [J]. Computer Science, 2021, 48(3): 119-123.
[5] ZHANG Xiao, ZHANG Si-meng, SHI Jia, DONG Cong, LI Zhan-huai. Review on Performance Optimization of Ceph Distributed Storage System [J]. Computer Science, 2021, 48(2): 1-12.
[6] XU Jiang-feng and TAN Yu-long. Research on HBase Configuration Parameter Optimization Based on Machine Learning [J]. Computer Science, 2020, 47(6A): 474-479.
[7] ZHANG Peng-yi, SONG Jie. Research Advance on Efficiency Optimization of Blockchain Consensus Algorithms [J]. Computer Science, 2020, 47(12): 296-303.
[8] XU Chuan-fu,WANG Xi,LIU Shu,CHEN Shi-zhao,LIN Yu. Large-scale High-performance Lattice Boltzmann Multi-phase Flow Simulations Based on Python [J]. Computer Science, 2020, 47(1): 17-23.
[9] LI Fang,LI Zhi-hui,XU Jin-xiu,FAN Hao,CHU Xue-sen,LI Xin-liang. Research on Adaptation of CFD Software Based on Many-core Architecture of 100P Domestic Supercomputing System [J]. Computer Science, 2020, 47(1): 24-30.
[10] ZHANG Ling-hao, GUI Sheng-lin, MU Feng-jun, WANG Sheng. Clone Detection Algorithm for Binary Executable Code with Suffix Tree [J]. Computer Science, 2019, 46(10): 141-147.
[11] YAO Jin-yang, ZHAO Rong-cai, WANG Qi, LI Ying-ying. Vectorization Methods for Indirect Array Index [J]. Computer Science, 2018, 45(9): 220-223.
[12] ZHAO Cheng, CHEN Jun-xin, YAO Ming-hai. XSS Attack Detection Technology Based on SVM Classifier [J]. Computer Science, 2018, 45(11A): 356-360.
[13] SUN Tao, ZHANG Jun-xing. Review of SDN Performance Optimization Technology [J]. Computer Science, 2018, 45(11A): 84-91.
[14] LI Rui-long, LIANG Yuan and ZHANG Song-hai. Cartoon Animations Segmentation and Vectorization Based on Canny Optimization [J]. Computer Science, 2017, 44(8): 27-30.
[15] SUN Zhi-long, Edwin H-M Sha, ZHUGE Qing-feng, CHEN Xian-zhang and WU Kai-jie. Research on Data Consistency for In-memory File Systems [J]. Computer Science, 2017, 44(2): 222-227.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!