计算机科学 ›› 2021, Vol. 48 ›› Issue (10): 152-159.doi: 10.11896/jsjkx.201100005

• 数据库&大数据&数据科学 • 上一篇    下一篇

时序图中Top-k稠密子图查询算法研究

穆聪聪, 王一舒, 袁野, 乔百友, 马玉亮   

  1. 东北大学计算机科学与工程学院 沈阳110000
  • 收稿日期:2020-11-02 修回日期:2021-03-16 出版日期:2021-10-15 发布日期:2021-10-18
  • 通讯作者: 袁野(yuanye@mail.neu.edu.cn)
  • 作者简介:mucong@stumail.neu.edu.cn
  • 基金资助:
    国家重点研发计划项目(2016YFCI401900)

Top-k Densest Subgraphs Search in Temporal Graphs

MU Cong-cong, WANG Yi-shu, YUAN Ye, QIAO Bai-you, MA Yu-liang   

  1. School of Computer Science and Engineering,Northeastern University,Shenyang 110000,China
  • Received:2020-11-02 Revised:2021-03-16 Online:2021-10-15 Published:2021-10-18
  • About author:MU Cong-cong,born in 1995,postgra-duate,is a member of China Computer Federation.His main research interests include temporal graph and graph data managment.
    YUAN Ye,born in 1981,professor,is a member of China Computer Federation.His main research interests include graph databases,probabilistic databa-ses,and social network analysis.
  • Supported by:
    National Key Research and Development Project(2016YFCI401900).

摘要: 稠密子图的查询是图分析领域的重要研究问题之一,在社交用户相关性分析、Web中社群分析等方面都有着广泛的应用。目前,关于稠密子图查询的研究工作主要基于静态图。而在实际应用中,时序信息会对稠密子图查询产生重要的影响,使得图拓扑结构随时间序列不断发生变化,包含的信息量也不断增加,使得已有的针对静态图的查找方法不再适用于时序图。因此,如何高效地在时序图上查找稠密子图仍然是一个挑战。为了解决上述挑战,首先规范化地定义了基于时序图的稠密子图查找问题;然后,根据图的拓扑结构和包含时间标签的边之间的相似度,提出一种基于阈值的近似查找算法DTS-base。为了加快算法的收敛速度,提出了一个基于快速计算最大相似度时间片的优化算法DTS-opt。最后,通过在真实数据集上的实验,证明了所提算法的高效性和可扩展性。

关键词: Top-k查询, 稠密子图, 时序图

Abstract: The dense subgraph search problem is one of the most important graph analysis problems.It is widely used in many fields,such as the social user correlation analysis in social networks,the community discovery in the Web,etc.However,the current research focuses on searching dense subgraphs on static graphs.In practical application,temporal information has an important impact on the query of dense subgraphs,which makes the topology structure of graphs change constantly with time sequences,and the amount of information contained also increases dramatically.Therefore,the existing searching methods for static graphs are no longer applicable to temporal graphs.Hence,it is still a challenge to search dense subgraphs efficiently on a temporal graph.In order to solve the above challenge,this paper first defines the Top-k densest temporal subgraphs searching problems in a standardized way.Then,to address the above challenge,this paper proposes an approximate searching algorithm DTS-base based on threshold according to the topology of the graph and the similarity between edges containing time tags.Furthermore,an optimization algorithm DTS-opt based on the fast calculation of maximum similarity time slice is proposed in order to accelerate the convergence speed.Finally,experiments on real data sets demonstrate the efficiency and scalability of the proposed algorithms.

Key words: Densest subgraph, Temporal graph, Top-k query

中图分类号: 

  • TP399
[1]FRATKIN E,NAUGHTON B T,BRUTLAG D L,et al.Motifcut:regulatory motifs finding with maximum density subgraphs[J].Bioinformatics,2006,22(14):150-157.
[2]RAVI K,PRABHAKAR R,SRIDHAR R,et al.Trawling the web for emerging cyber-communities[J].Computer Networks,1999,31(11-16):1481-1493.
[3]SOZIO M,GIONIS A.The community-search problem and how to plan a successful cocktail party [C]//Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining (KDD).Washington,DC,USA,2010:939-948.
[4]CHENG J,KE Y,CHU S,et al.Efficient core decomposition in massive networks[C]//Proceedings of the 27th International Conference on Data Engineering.Hannover,Germany,2011:51-62.
[5]UNO T.An efficient algorithm for solving pseudo clique enu-meration problem[J].Algorithmica,2010,56(1):3-16.
[6]CHENG J,KE Y,FU W C,et al.Finding maximal cliques in massive networks by h*-graph [C]//Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data.Indianapolis,Indiana,USA,2010:447-458.
[7]TSOURAKAKIS C.The k-clique densest subgraph problem[C]//Proceedings of the 24th International Conference on World Wide Web.Florence,Italy,2015:1122-1132.
[8]ABELLO J,RESENDE M G C,SUDARSKY S.Massive quasi-clique detection [C]//Theoretical Informatics,5th Latin American Symposium.Cancun,Mexico,2002:598-612.
[9]MITZENMACHER M,PACHOCKI J,PENG R,et al.Scalable large near-clique detection in large-scale networks via sampling [C]//Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Sydney,NSW,Australia,2015:815-824.
[10]LI R H,QIN L,YU J X,et al.Influential community search in large networks[J].Proceedings of the VLDB Endowment,2015,8(5):509-520.
[11]MA S,HU R J,WANG L,et al.Fast Computation of Dense Temporal Subgraphs [C]//Proceeding of the 33rd IEEE International Conference on Data Engineering.San Diego,CA,USA,2017:361-372.
[12]YANG Y,YAN D,WU H H,et al.Diversified Temporal Subgraph Pattern Mining [C]//Proceeding of the 22nd ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining.New York,USA,2016:1965-1974.
[13]TSOURAKAKIS C,BONCHI F,GIONIS A,et al.Denser than the densest subgraph:extracting optimal quasi-cliques with qua-lity guarantees [C]//Proceeding of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Chicago,IL,USA,2013:104-112.
[14]GOLDBERG A V.Finding a maximum density subgraph[D].Berkeley:University of California,1984.
[15]CHARIKAR M.Greedy approximation algorithms for findingdense components in a Graph[C]//Proceeding of the Approximation Algorithms for Combinatorial Optimization,Third International Workshop.Saarbrücken,Germany,2000:84-95.
[16]YANG Y J,GAO H,LI J Z.Survey on querying and mining algorithms on dynamic graphs[J].Intelligent Computer and Application,2013,3(4):24-27.
[17]SARMA A D,LALL A,NANONGKAI D,et al.Dense sub-graphs on dynamic networks [C]//Proceeding of the Distributed Computing-26th International Symposium.Salvador,Brazil,2012:151-165.
[18]ANGEL A,KOUDAS N,SARKAS N,et al.Dense subgraphmaintenance under streaming edge weight updates for real-time story identification[J].The VLDB Journal,2014,23(2):175-199.
[19]VALARI E,KONTAKI M,PAPADOPOULOS A N.Discoveryof top-k dense subgraphs in dynamic graph collections [C]//Proceeding of the Scientific and Statistical Database Management-24th International Conference.Chania,Crete,Greece,2012:213-230.
[20]WU H,HUANG Y,CHENG J,et al.Efficient processing ofreachability and time-based path queries in a temporal graph [C]//Proceedings of the 32nd International Conference on Data Engineering.Helsinki,Finland,2016:145-156.
[21]SEMERTZIDIS K,PITOURA E,TERZI E,et al.Finding la-sting dense subgraphs[J].Data Mining and Knowledge Disco-very,2019,33(5):1417-1445.
[22]LI Y,LIU J S,ZHAO H Q,et al.Research on Sustained Cohesive Subgraph Search Algorithm in Large Temporal Graphs[J].Computer Engineering and Applications,2021,4:1-11.
[1] 王金明,王远方.
基于Twitter Storm平台并行挖掘最稠密子图
Parallel Mining of Densest Subgraph Based on Twitter Storm
计算机科学, 2014, 41(1): 274-278.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!