计算机科学 ›› 2016, Vol. 43 ›› Issue (4): 231-234.doi: 10.11896/j.issn.1002-137X.2016.04.047

• 人工智能 • 上一篇    下一篇

一种松弛的优化均衡流式图划分算法研究

殷晓波,罗恩   

  1. 安徽理工大学计算机科学与工程学院 淮南232001,中南大学信息科学与工程学院 长沙410083
  • 出版日期:2018-12-01 发布日期:2018-12-01
  • 基金资助:
    本文受安徽省科技厅自然科学基金重点项目(1308085MF94)资助

Relaxed Optimal Balanced Streaming Graph Partitioning Algorithm

YIN Xiao-bo and LUO En   

  • Online:2018-12-01 Published:2018-12-01

摘要: 在大规模图数据的分布式处理中,往往需要将图数据进行划分并放置在不同的节点上。如果数据划分得不均衡,那么部分节点可能会成为分布式系统的瓶颈。为了提高图数据划分的均衡性,并且有效地应对图数据的快速更新,提出了一种松弛的优化均衡流式图划分算法。首先,定义了一种同时包含划分内部代价和划分之间的割的代价的目标函数作为图划分的整体框架。然后,在图划分框架的基础上通过最大化和最小化两种优化函数分析了均衡图划分问题,并给出了二者之间的关系。最后,针对流式图数据,提出一种贪婪的图最优k划分算法。该划分算法以最大化优化函数为基础,通过最大化顶点放置产生的目标函数增加值进行节点划分块的选取。实验表明,提出的图划分算法与相关算法相比,不仅均衡性好,而且通信开销小,在基于该算法进行图划分时上层应用的计算性能得到了明显的提高。

关键词: 图,均衡图划分,分布式计算,启发式算法

Abstract: During the distributed processing of large scale graph data,it usually needs to partition the whole graph into different nodes.If the partition isn’t balanced,some of the nodes may be the bottleneck of the distributed system.In order to achieve a balanced partition and handle the quick updates of graph efficiently,this paper proposed a relaxed optimal balanced streaming graph partitioning algorithm.Firstly,we defined an objective function including costs of both inter-partitions and intra-partition as the general graph partition framework.Secondly,we analyzed the graph partition problem according to a maximal and a minimal optimization functions based within the proposed framework,and gave their relationships.Finally,we proposed a greedy optimal graph k-partitions algorithm upon the streaming graph.The experiments show that,the proposed graph partitioning algorithm have better balance and lower communication cost than related works,and upper applications upon this algorithm have better performance.

Key words: Graph,Balanced graph partitioning,Distributed computing,Heuristic algorithm

[1] Kwak H,Lee C,Park H,et al.What is Twitter,a social network or a news media?[C]∥Proceedings of the 19th International Conference on World Wide Web.ACM,2010:591-600
[2] Xu Da-chuan,Han Ji-ye,Du Dong-lei.Improved Approximation Algorithm for Graph Partition Problem[J].Acta Mathematica Applicatae Sinica,2006,8(4):587-597(in Chinese) 徐大川,韩继业,杜东雷.关于图划分问题的改进的近似算法[J].应用数学学报,2006,28(4):587-597
[3] Malewicz G,Austern M H,Bik A J C,et al.Pregel:a system for large-scale graph processing[C]∥Proceedings of the 2010 ACM SIGMOD International Conference on Management of data.ACM,2010:135-146
[4] Kang U,Tsourakakis C E,Faloutsos C.Pegasus:A peta-scalegraph mining system implementation and observations[C]∥Ninth IEEE International Conference on Data Mining,2009(ICDM609).IEEE,2009:229-238
[5] Low Y,Bickson D,Gonzalez J,et al.Distributed GraphLab:a framework for machine learning and data mining in the cloud[J].Proceedings of the VLDB Endowment,2012,5(8):716-727
[6] Szell M,Lambiotte R,Thurner S.Multirelational organization of large-scale social networks in an online world[J].Proceedings of the National Academy of Sciences,2010,107(31):13636-13641
[7] Garey M R,Johnson D S,Stockmeyer L.Some simplified NP-complete problems[C]∥Proceedings of the sixth Annual ACM Symposium on Theory of Computing.ACM,1974:47-63
[8] Zhou Shuang,Bao Yu-bin,Wang Zhi-gang,et al.BHP:BSP Mo-del Oriented Hash Graph Data Partition with Load Balancing[J].Journal of Frontiers of Computer Science & Technology,2014,8(1):40-50(in Chinese) 周爽,鲍玉斌,王志刚,等.BHP:面向 BSP 模型的负载均衡 Hash 图数据划分[J].计算机科学与探索,2014,8(1):40-50
[9] Ke Q,Prabhakaran V,Xie Y,et al.Optimizing data partitioning for data-parallel computing[C]∥HotOS XIII.May 2011
[10] Berger-Wolf T Y,Saia J.A framework for analysis of dynamic social networks[C]∥Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mi-ning.ACM,2006:523-528
[11] Naaman M,Boase J,Lai C H.Is it really about me?message content in social awareness streams[C]∥Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work.ACM,2010:189-192
[12] Stanton I,Kliot G.Streaming graph partitioning for large distributed graphs[C]∥Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mi-ning.ACM,2012:1222-1230
[13] Krauthgamer R,Naor J S,Schwartz R.Partitioning graphs into balanced components[C]∥Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms.Society for Industrial and Applied Mathematics,2009:942-949
[14] Lin Jiao,Chen Wen-guang,Li Qiang,et al.A New Data Clustering Algorithm for Parallel Whole-Genome Shotgun Sequence Assembly[J].Journal of Computer Research and Development,2006,3(8):1323-1329(in Chinese) 林皎,陈文光,栗强,等.基于图划分的全基因组并行拼接算法[J].计算机研究与发展,2006,43(8):1323-1329
[15] Karypis G,Kumar V.A fast and high quality multilevel scheme for partitioning irregular graphs[J].SIAM Journal on scientific Computing,1998,20(1):359-392
[16] Prabhakaran V,Wu M,Weng X,et al.Managing Large Graphs on Multi-Cores with Graph Awareness[C]∥USENIX Annual Technical Conference.2012:41-52
[17] Newman M E J.The structure and function of complex net-works[J].SIAM Review,2003,45(2):167-256
[18] Tsourakakis C,Bonchi F,Gionis A,et al.Denser than the densest subgraph:Extracting optimal quasi-cliques with quality guarantees[C]∥Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2013:104-112
[19] Stanton I,Kliot G.Streaming graph partitioning for large distributed graphs[C]∥Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mi-ning.ACM,2012:1222-1230
[20] Zafarani R,Cole W D,Liu H.Sentiment propagation in social networks:a case study in livejournal[M]∥Advances in Social Computing.Springer Berlin Heidelberg,2010:413-420
[21] Leng Yong-lin,Shen Hua,Lu Fu-yu.Distributed Storage of RDF Directed Graph Based on P-Rank Algorithm[J].Journal of Chongqing University of Technology (Natural Science),2015,9(1):91-95(in Chinese) 冷泳林,申华,鲁富宇.基于P-Rank的RDF有向图的分布式存储[J].重庆理工大学学报(自然科学),2015,29(1):91-95

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!