摘要: 在大规模图结构数据中发现最稠密子图具有极其广泛的应用,如社区发现、垃圾邮件检测和论文引用关系抽取等。基于带标签的无向图,提出了查询标签集的概念,设计了一个可以快速发现最稠密子图的近似算法DSFLC(Densest Subgraph Finding based on Labelset Constraint ):用户提交自定义的查询标签集,算法便可保证在用户可以接受的时间内返回满足查询标签集约束的最稠密子图。对于任何参数ε(ε>0),DSFLC算法只需扫描大规模数据集O(log1+εn)次,同时可保证算法的近似因子是2(1+ε)。对DSFLC算法进行分析后,发现该算法在预处理阶段易于并行化,因此选择Twitter Storm平台,并行化地实现了DSFLC算法。最后对从DBLP数据库中抽取的合作关系图进行测试,一方面研究Storm平台对算法的加速程度;另一方面分析挖掘出的子图的稠密度与参数ε之间的关系,最终验证了DSFLC算法的实用性和可扩展性。
[1] Yon D,Filippo G,Marco P.Extraction and classification of dense communities in the Web[C]∥WWW 2007.2007:461-470 [2] Newman M E J.Modularity and community structure in net-works[C]∥PNAS 2006.2006:8577-8582 [3] William F G,Steve L,Lee G C.Efficient identification of Web communities[C]∥KDD 2000.2000:150-160 [4] https://github.com/nathanmarz/storm [5] http://www.dblp.org/db/ [6] Goldberg A V.Finding a maximum density Subgraph[R].UCB/CSD-84-171.EECS Department,University of California,Berkeley,1984 [7] Charikar M.Greedy approximation algorithms for finding dense components in a graph[C]∥APPROX 2000.2000:84-90 [8] Lawler F.Combinatorial Optimization:Networks and Matroids[M].Holt,Rinehart,and Winston,1976 [9] Gibson D,Kumar R,Tomkins A.Discovering large dense subgraphs in massive graphs[C]∥VLDB 2005.2005:721-732 [10] Bahmani B,umar R.Sergei Vassilvitskii:Densest Subgraph inStreaming and MapReduce[C]∥VLDB 2012.2012:454-465 [11] Silva A,Meira W Jr,Zaki M J.Mining Attribute-structure Correlated Patterns in Large Attributed Graphs[C]∥VLDB.2012:466-477 [12] Jeffrey D,Sanjay G.MapReduce:simplified data processing onlarge clusters[C]∥ACM 2008.2008:107-113 [13] Bu Ying-yi,Howe B,Balazinska M,et al.HaLoop:Efficient Iterative Data Processing on Large Clusters[C]∥VLDB 2010.2010:285-296 [14] Abouzeid A,Bajda-Pawlikowski K,Abadi D J,et al.HadoopDB:An architectural hybrid of MapReduce and DBMS technologies for analytical workloads[C]∥VLDB 2009.2009:922-933 [15] Condie T,Conway N,Alvaro P,et al.MapReduce Online[C]∥NSDI’10Proceedings of the 7th USENIX Conference on Networked Systems Design and Implementation.USENIX Association Berkeley,CA,USA,2010:21 |
No related articles found! |
|