Computer Science ›› 2014, Vol. 41 ›› Issue (1): 274-278.

Previous Articles     Next Articles

Parallel Mining of Densest Subgraph Based on Twitter Storm

WANG Jin-ming and WANG Yuan-fang   

  • Online:2018-11-14 Published:2018-11-14

Abstract: In large scale graph,finding densest subgraph has a wide range of applications,such as community discovery,spam detection and reference relation extraction.Based on tagged undirected graph,we introduced the concept of QLS and designed an approximation algorithm DSFLC which can quickly find the densest subgraph:users submit a QLS and the algorithm will return the densest subgraph under QLS within the time that user can accept.For any ε>0,DSFLC only needs to scan large-scale data sets O(log1+εn)times,and can ensure the approximation algorithm is a 2(1+ε)-approximation algorithm.After analyzing DSFLC,we found this algorithm is easy to parallelize,so we chose Twitter Storm platform to parallel DSFLC algorithm.Finally,the test data sets extracted from the DBLP database verify DSFLC’ practicality and scalability.

Key words: Densest subgraph finding,QLS,DSFLC algorithm,Twitter storm platform

[1] Yon D,Filippo G,Marco P.Extraction and classification of dense communities in the Web[C]∥WWW 2007.2007:461-470
[2] Newman M E J.Modularity and community structure in net-works[C]∥PNAS 2006.2006:8577-8582
[3] William F G,Steve L,Lee G C.Efficient identification of Web communities[C]∥KDD 2000.2000:150-160
[4] https://github.com/nathanmarz/storm
[5] http://www.dblp.org/db/
[6] Goldberg A V.Finding a maximum density Subgraph[R].UCB/CSD-84-171.EECS Department,University of California,Berkeley,1984
[7] Charikar M.Greedy approximation algorithms for finding dense components in a graph[C]∥APPROX 2000.2000:84-90
[8] Lawler F.Combinatorial Optimization:Networks and Matroids[M].Holt,Rinehart,and Winston,1976
[9] Gibson D,Kumar R,Tomkins A.Discovering large dense subgraphs in massive graphs[C]∥VLDB 2005.2005:721-732
[10] Bahmani B,umar R.Sergei Vassilvitskii:Densest Subgraph inStreaming and MapReduce[C]∥VLDB 2012.2012:454-465
[11] Silva A,Meira W Jr,Zaki M J.Mining Attribute-structure Correlated Patterns in Large Attributed Graphs[C]∥VLDB.2012:466-477
[12] Jeffrey D,Sanjay G.MapReduce:simplified data processing onlarge clusters[C]∥ACM 2008.2008:107-113
[13] Bu Ying-yi,Howe B,Balazinska M,et al.HaLoop:Efficient Iterative Data Processing on Large Clusters[C]∥VLDB 2010.2010:285-296
[14] Abouzeid A,Bajda-Pawlikowski K,Abadi D J,et al.HadoopDB:An architectural hybrid of MapReduce and DBMS technologies for analytical workloads[C]∥VLDB 2009.2009:922-933
[15] Condie T,Conway N,Alvaro P,et al.MapReduce Online[C]∥NSDI’10Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation.USENIX Association Berkeley,CA,USA,2010:21

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!