计算机科学 ›› 2013, Vol. 40 ›› Issue (10): 190-193.
陈德华,周蒙,孙延青,郑亮亮
CHEN De-hua,ZHOU Meng,SUN Yan-qing and ZHENG Liang-liang
摘要: 图的稀疏化是图聚类分析中数据预处理的关键操作,已得到广泛的关注。针对图数据日益普及、规模不断增大的现状,提出了一种基于MapReduce的面向大规模图的稀疏化算法,即MR-GSpar算法。该算法在MapReduce并行计算框架的基础上,通过对传统的最小哈希(Minhash)算法的并行化改造,使其可在分布式的集群环境中实现对大规模图数据的高效稀疏化处理。真实数据集上的实验表明了该算法的可行性与有效性。
[1] Satuluri V,Parthasarathy S.Scalable graph clustering using stochastic flows:applications to community discovery[C]∥ACM SIGKDD.2009:737-746 [2] Kulis B,Basu S,Dhillon I,et al.Semi-supervised Graph Clustering:A Kernel Approach[J].Machine Learning,2009,74(1):1-22 [3] Satuluri V,Parthasarathy S,Ruan Y.Local graph Sparsification for Scalable Clustering[C]∥SIGMOD.2011:737-746 [4] 李建江,崔健,王聃,等.MapReduce并行编程模型研究综述[J].电子学报,2011,39(11):2635-2642 [5] Lin J,Schatz M.Design patterns for efficient graph algorithms in mapreduce[C]∥MLG.2010:78-85 [6] 尹丹,高宏,邹兆年,等.一种新的高效图聚集算法[J].计算机研究与发展,2011,48(10):1831-1841 [7] Lv Qin,Josephson W,Wang Zhe,et al.Multi-probe LSH:efficient indexing for high-dimensional similarity search[C]∥Proc of the 33rd Int Conf on Very Large Data Bases(VLDB’07). Vien-na Austria:VLDB Endowment,2007:950-961 |
No related articles found! |
|