计算机科学 ›› 2024, Vol. 51 ›› Issue (1): 113-123.doi: 10.11896/jsjkx.231000153

• 数据库&大数据&数据科学 • 上一篇    下一篇

锚社区时序网络图生成算法

郑舒文, 王朝坤   

  1. 清华大学软件学院 北京100084
  • 收稿日期:2023-10-20 修回日期:2023-11-23 出版日期:2024-01-15 发布日期:2024-01-12
  • 通讯作者: 王朝坤(chaokun@tsinghua.edu.cn)
  • 作者简介:(zsw22@mails.tsinghua.edu.cn)
  • 基金资助:
    国家自然科学基金(62372264,61872207)

Generation Algorithm of Temporal Networks with Anchor Communities

ZHENG Shuwen, WANG Chaokun   

  1. School of Software,Tsinghua University,Beijing 100084,China
  • Received:2023-10-20 Revised:2023-11-23 Online:2024-01-15 Published:2024-01-12
  • About author:ZHENG Shuwen,born in 2000,master,is a member of CCF(No.P8989G).Her main research interests include social network analysis and graph generation.
    WANG Chaokun,born in 1976,Ph.D,associate professor,Ph.D supervisor,is a senior member of CCF(No.18328S).His main research interests include graph data management,social network analysis,and big data systems.
  • Supported by:
    National Natural Science Foundation of China(62372264,61872207).

摘要: 图数据相关分析任务往往需要合成数据集来检验和评估算法的有效性和高效性。真实世界图数据不仅在拓扑上具有社区结构特征,还往往在时序上呈现出一定的演化特性,社区节点可能在锚定时间窗口内频繁交互。然而,现有合成方法存在一定局限性。大多方法或仅关注网络中的社区结构,或仅关注网络中的时序信息,无法生成节点锚时频繁交互的社区。为克服此局限,提出了锚社区概念及定义以刻画社区内节点锚时频繁交互的特性;接着,基于分布概率生成模型提出了一般时序图生成算法;进一步地,提出了锚社区时序网络图生成算法(GTN-AC),不仅允许用户配置锚定时间窗口,还允许用户指定度数分布和时间戳分布。实验结果表明,相较于基准方法,GTN-AC能在保证较优生成质量的同时拥有较快的生成速度。

关键词: 时序网络, 锚定时间窗口, 锚社区, 分布概率生成模型, 图生成

Abstract: Algorithms for network analysis tasks require synthetic graph datasets to evaluate their effectiveness and efficiency.Real-world graph data not only possess topological features such as community structures,but also contain temporal information revealing evolutionary semantics.Nodes of real-world communities may interact with each other within a specific anchor time window.However,existing graph generation methods suffer from some limitations.Most of them concentrate on either static community structures or temporal graphs without community structures,appearing weak in generating communities active during an anchor time period.To surmount their weakness,this paper introduces the concept of anchor community to depict frequent interactions between a group of nodes within an anchor time window.Then it proposes an algorithm to synthesize general temporal networks based on the distribution probability generation model,and further proposes an efficient generation algorithm of temporal networks with anchor communities(GTN-AC),allowing configuration input such as anchor time windows as well as specified distributions of degree and timestamp.Extensive experimental results indicate that compared with other baseline methods,GTN-AC has a faster generation speed while ensuring preferable generation quality.

Key words: Temporal network, Anchor time window, Anchor community, Distribution probability generation model, Graph gene-ration

中图分类号: 

  • TP319
[1]FORTUNATO S.Community detection in graphs[J].PhysicsReports,2010,486(3/4/5):75-174.
[2]BLONDEL V D,GUILLAUME J L,LAMBIOTTE R,et al.Fast unfolding of communities in large networks[J].Journal of Statistical Mechanics:Theory and Experiment,2008,2008(10):P10008.
[3]VAN LIERDE H,CHOW T W S,CHEN G.Scalable spectral clustering for overlapping community detection in large-scale networks[J].IEEE Transactions on Knowledge and Data Engineering,2019,32(4):754-767.
[4]ZHU J C,WANG C K.Approaches to Community Search Under ComplexConditions[J].Journal of Software,2019,30(3):552-572.
[5]KIM J,LUO S,CONG G,et al.DMCS:Density modularitybased community search[C]//Proceedings of the 2022 International Conference on Management of Data.2022:889-903.
[6]XU T,LU Z,ZHU Y.Efficient Triangle-Connected Truss Community Search in Dynamic Graphs[J].Proceedings of the VLDB Endowment,2022,16(3):519-531.
[7]LANCICHINETTI A,FORTUNATO S.Benchmarks for tes-ting community detection algorithms on directed and weighted graphs with overlapping communities[J].Physical Review E,2009,80(1):016118.
[8]WANG C,WANG B,HUANG B,et al.Fastsgg:Efficient social graph generation using a degree distribution generation model[C]//2021 IEEE 37th International Conference on Data Engineering(ICDE).IEEE,2021:564-575.
[9]YOU J,YING R,REN X,et al.Graphrnn:Generating realistic graphs with deep auto-regressive models[C]//International Conference on Machine Learning.PMLR,2018:5708-5717.
[10]GIRVAN M,NEWMAN ME J.Community structure in social and biological networks[J].Proceedings of the National Academy of Sciences,2002,99(12):7821-7826.
[11]WANG M,WANG C,YU J X,et al.Community detection in social networks:an in-depth benchmarking study with a procedure-oriented framework[J].Proceedings of the VLDB Endowment,2015,8(10):998-1009.
[12]SRIHARI S,LEONG H W.Temporal dynamics of protein complexes in PPI networks:a case study using yeast cell cycle dynamics[C]//BMC bioinformatics.BioMed Central,2012:1-9.
[13]RIGAUT G,SHEVCHENKO A,RUTZ B,et al.A generic protein purification method for protein complex characterization and proteome exploration[J].Nature biotechnology,1999,17(10):1030-1032.
[14]LEVY E D,PEREIRA-LEAL J B.Evolution and dynamics of protein interactions and networks[J].Current Opinion in Structural Biology,2008,18(3):349-357.
[15]NOOREN I M A,THORNTON J M.Diversity of protein-protein interactions[J].The EMBO Journal,2003,22(14):3486-3492.
[16]OU-YANG L,DAI D Q,LI X L,et al.Detecting temporal protein complexes from dynamic protein-protein interaction networks[J].BMC bioinformatics,2014,15:1-14.
[17]RANI R R,RAMYACHITRA D,BRINDHADEVI A.Detection of dynamic protein complexes through markov clustering based on elephant herd optimization approach[J].Scientific Reports,2019,9(1):1-18.
[18]LEI X,DING Y,FUJITA H,et al.Identification of dynamic protein complexes based on fruit fly optimization algorithm[J].Knowledge-Based Systems,2016,105:270-277.
[19]WEN D,HUANG Y,ZHANG Y,et al.Efficiently answeringspan-reachability queries in large temporal graphs[C]//2020 IEEE 36th International Conference on Data Engineering(ICDE).IEEE,2020:1153-1164.
[20]PUROHIT S,HOLDER L B,CHIN G.Temporal graph generation based on a distribution of temporal motifs[C]//Proceedings of the 14th International Workshop on Mining and Learning with Graphs.2018:7.
[21]MASSRI M,MIKLOS Z,RAIPIN P,et al.RTGEN++:A relative temporal graph generator[J].Future Generation Computer Systems,2023,146:139-155.
[22]BOJCHEVSKI A,SHCHURO,ZÜGNER D,et al.Netgan:Ge-nerating graphs via random walks[C]//International conference on machine learning.PMLR,2018:610-619.
[23]ZHANG L,ZHAO L,QIN S,et al.TG-GAN:Continuous-time temporal graph deep generative models with time-validity constraints[C]//Proceedings of the Web Conference 2021.2021:2104-2116.
[24]SIMONOVSKY M,KOMODAKIS N.Graphvae:Towards generation of small graphs using variational autoencoders[C]//Artificial Neural Networks and Machine Learning-ICANN 2018:27th International Conference on Artificial Neural Networks,Rhodes,Greece,October 4-7,2018,Proceedings,Part I 27.Springer International Publishing,2018:412-422.
[25]XIANG S,CHENG D,ZHANG J,et al.Efficient Learning-based Community-Preserving Graph Generation[C]//2022 IEEE 38th International Conference on Data Engineering(ICDE).IEEE,2022:1982-1994.
[26]MISLOVE A,MARCON M,GUMMADI K P,et al.Measure-ment and analysis of online social networks[C]//Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement.2007:29-42.
[27]BARABÁSI A L,ALBERT R.Emergence of scaling in random networks[J].Science,1999,286(5439):509-512.
[28]PALLA G,DERÉNYI I,FARKAS I,et al.Uncovering theoverlapping community structure of complex networks in nature and society[J].Nature,2005,435(7043):814-818.
[29]CHAKRABARTI D,ZHAN Y,FALOUTSOS C.R-MAT:A recursive model for graph mining[C]//Proceedings of the 2004 SIAM International Conference on Data Mining.Society for Industrial and Applied Mathematics,2004:442-446.
[30]BAGAN G,BONIFATI A,CIUCANU R,et al.gMark:Schema-driven generation of graphs and queries[J].IEEE Transactions on Knowledge and Data Engineering,2016,29(4):856-869.
[31]LESKOVEC J,CHAKRABARTI D,KLEINBERG J,et al.Kronecker graphs:an approach to modeling networks[J].Journal of Machine Learning Research,2010,11(2):985-1042.
[32]HADIAN A,NOBARI S,MINAEI-BIDGOLI B,et al.Roll:Fast in-memory generation of gigantic scale-free networks[C]//Proceedings of the 2016 International Conference on Management of Data.2016:1829-1842.
[33]PARK H,KIM M S.TrillionG:A trillion-scale synthetic graph generator using a recursive vector model[C]//Proceedings of the 2017 ACM International Conference on Management of Data.2017:913-928.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!