计算机科学 ›› 2016, Vol. 43 ›› Issue (9): 61-65.doi: 10.11896/j.issn.1002-137X.2016.09.011
• 2015 年第三届CCF 大数据学术会议 • 上一篇 下一篇
张忠正,李建武
ZHANG Zhong-zheng and LI Jian-wu
摘要: 处理海量级数据的有效途径之一是将算法分解为一系列互不依赖的任务,然后利用开源工具并行地执行算法。而在重叠社区发现算法中,基于局部拓展的方法在拓展阶段往往仅需要局部社区及其相应的邻居结点的信息,因而具备可并行执行的可能性。提出了一种可并行化执行的局部拓展算法,并借助开源工具Spark将其实现。算法分为4个阶段。首先,挑选出一组不相关的中心结点并使用其对应的局部网络作为种子;其次,通过删除本身连接不是很紧密的局部网络来过滤选出的种子;然后,采用一种批量式的拓展策略来拓展种子,即一次向局部社区中添加一批邻居结点或从社区中删除一批结点;最后,融合相似的社区。在人工生成的网络以及真实世界中的网络上的实验结果显示 ,所提算法既准确又高效。
[1] Newman M E J.The structure of scientific collaboration networks[J].Proceedings of the National Academy of Sciences,2001,98(2):404-409 [2] Girvan M,Newman M E J.Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences,2002,99(12):7821-7826 [3] Deng K,Zhang J,Yang J.Mobile Recommendation Based onLink Community Detection[J].Scientific World Journal,2013,4(11):259156 [4] Sahebi S,Cohen W W.Community-based recommendations:asolution to the cold start problem[C]∥Workshop on Recommender Systems and the Social Web.RSWEB,2011 [5] Au Yeung C,Gibbins N,Shadbolt N.Contextualising tags incollaborative tagging systems[C]∥Proceedings of the 20th ACM Conference on Hypertext and Hypermedia.ACM,2009:251-260 [6] Palla G,Derényi I,Farkas I,et al.Uncovering the overlapping community structure of complex networks in nature and society[J].Nature,2005,435(7043):814-818 [7] Ahn Y Y,Bagrow J P,Lehmann S.Link communities revealmultiscale complexity in networks[J].Nature,2010,466(7307):761-764 [8] Gregory S.Finding overlapping communities in networks by label propagation[J].New Journal of Physics,2009,12(10):2011-2024 [9] Lancichinetti A,Fortunato S,Kertész J.Detecting the overlapping and hierarchical community structure in complex networks[J].New Journal of Physics,2009,11(3):033015 [10] Lee C,Reid F,McDaid A,et al.Detecting highly overlappingcommunity structure by greedy clique expansion[C]∥ International Conference on Paper Presented at Sna-kdd Workshop.2010 [11] Gleich D F,Seshadhri C.Vertex neighborhoods,low conduc-tance cuts,and good seeds for local community methods[C]∥Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2012:597-605 [12] Whang J J,Gleich D F,Dhillon I S.Overlapping community detection using seed set expansion[C]∥Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management.ACM,2013:2099-2108 [13] Gargi U,Lu W,Mirrokni V S,et al.Large-Scale Community Detection on YouTube for Topic Discovery and Exploration[C]∥ICWSM.2011 [14] Dean J,Ghemawat S.MapReduce:simplified data processing on large clusters[J].Communications of the ACM,2008,51(1):107-113 [15] Zaharia M,Chowdhury M,Das T,et al.Resilient distributeddatasets:A fault-tolerant abstraction for in-memory cluster computing[C]∥Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation.USENIX Association,2012:2-2 [16] Watts D J,Strogatz S H.Collective dynamics of ‘small-world’networks[J].Nature,1998,393(6684):440-442 [17] Barabási A L,Albert R.Emergence of scaling in random networks[J].Science,1999,286(5439):509-512 [18] Dhillon I S,Guan Y,Kulis B.Weighted graph cuts without eigenvectors a multilevel approach[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2007,29(11):1944-1957 [19] Yang S,Wang B,Zhao H,et al.Efficient dense structure mining using MapReduce[C]∥IEEE International Conference on Data Mining Workshops,2009(ICDMW’09).IEEE,2009:332-337 [20] Tomita E,Tanaka A,Takahashi H.The worst-case time com-plexity for generating all maximal cliques and computational experiments[J].Theoretical Computer Science,2006,363(1):28-42 [21] Wu B,Du Y H.Cloud-based connected component algorithm[C]∥2010 International Conference on Artificial Intelligence and Computational Intelligence (AICI).IEEE,2010,3:122-126 [22] Raghavan U N,Albert R,Kumara S.Near linear time algorithm to detect community structures in large-scale networks[J].Physical Review E,2007,76(3):036106 [23] Zhao W,Martha V,Xu X.Pscan:A parallel structural clustering algorithm for big networks in mapreduce[C]∥2013 IEEE 27th International Conference on Advanced Information Networking and Applications (AINA).IEEE,2013:862-869 [24] Lancichinetti A,Fortunato S.Community detection algorithms:a comparative analysis[J].Physical review E,2009,80(5):056117 [25] Zachary W W.An information flow model for conflict and fission in small groups[J].Journal of Anthropological Research,1977,33(4):452-473 [26] Lusseau D,Schneider K,Boisseau O J,et al.The bottlenose dolphin community of Doubtful Sound features a large proportion of long-lasting associations[J].Behavioral Ecology and Sociobio-logy,2003,54(4):396-405 [27] Yang J,Leskovec J.Defining and Evaluating Network Communities Based on Ground-Truth[C]∥2012 IEEE 12th International Conference on Data Mining.IEEE Computer Society,2012:745-754 |
No related articles found! |
|