计算机科学 ›› 2025, Vol. 52 ›› Issue (12): 115-124.doi: 10.11896/jsjkx.241100090

• 数据库&大数据&数据科学 • 上一篇    下一篇

PIEnum:高效的概率图上路径枚举算法

谢文林, 杜明, 周军锋   

  1. 东华大学计算机科学与技术学院 上海 201620
  • 收稿日期:2024-11-24 修回日期:2025-02-15 出版日期:2025-12-15 发布日期:2025-12-09
  • 通讯作者: 杜明(duming@dhu.edu.cn)
  • 作者简介:(2222702@mail.dhu.edu.cn)
  • 基金资助:
    国家自然科学基金(62372101,61873337,62272097)

PIEnum:Efficient Algorithm of Path Enumeration on Large Uncertain Graphs

XIE Wenlin, DU Ming, ZHOU Junfeng   

  1. School of Computer Science and Technology, Donghua University, Shanghai 201620, China
  • Received:2024-11-24 Revised:2025-02-15 Published:2025-12-15 Online:2025-12-09
  • About author:XIE Wenlin,born in 1997,postgra-duate.His main research interest is query processing on large scale graph data and optimization.
    DU Ming,born in 1975,Ph.D,professor.His main research interests include natural language processing and information analysis.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China(62372101,61873337,62272097).

摘要: 枚举概率图上两个顶点间的路径是分析两点间关系的基本手段。为了解决已有算法存在的剪枝不充分、冗余计算等问题,提出了一种基于剪枝和索引的算法——PIEnum,其任务是在概率图G上枚举所有从起点s到终点t,长度不超过k且路径上所有边的概率累积值不低于γ的简单路径,其中kγ分别为给定的路径长度约束值和概率阈值。对于一个查询,PIEnum首先剔除无效顶点以缩减路径枚举的搜索空间,然后构建一个轻量级的在线索引来避免路径枚举过程中重复的剪枝判断,最后在路径枚举的过程中将无效的搜索分支剪枝。为了进一步提升算法在稠密图上的查询效率,基于Join模式实现了PIEnum+。在10个真实数据集上检验了该算法的性能,实验结果表明,PIEnum整体性能比已有算法提升了10倍以上。

关键词: 图, 概率图, 路径枚举, 剪枝, 索引

Abstract: A basic approach of investigating the relationship between two vertices on the uncertain graph is to enumerate all paths between them.To solve the problem of inadequate pruning and redundant computation in the state-of-the-art algorithms,this paper proposes an pruning and index based algorithm,PIEnum,whose target is to enumerate simple paths from a source vertex s to a target vertex t where the length of each path is no more than a given hop constraint k and the probability of each path is no less than a given probability threshold γ.For an input query,it firstly excludes the unpromising vertices to reduce the search space.Then it builds an online light-weight index to avoid repeated pruning examinations during the enumeration.Finally,it develops an efficient approach to prune invalid search branches during the enumeration.To further improve the performance on dense graphs,it implements PIEnum+ based on the Join paradigm.The comprehensive experimental results on 10 real-life graphs show that PIEnum improves the overall performance by at least 10 times compared to the state-of-the-art algorithms.

Key words: Graph, Uncertain graph, Path enumeration, Pruning, Index

中图分类号: 

  • TP301
[1]LIU H,JIN C,YANG B,et al.Finding top-k shortest paths with diversity[C]//ICDE.IEEE Computer Society,2018:1761-1762.
[2]OUYANG D,YUAN L,QIN L,et al.Efficient shortest path index maintenance on dynamic road networks with theoretical guarantees[J].VLDB Endow,2020,13(5):602-615.
[3]PENG Y,YU J X,WANG S.PSPC:efficient parallel shortest path counting on large-scale graphs[C]//ICDE.IEEE,2023:896-908.
[4]LAI L,QING Z,YANG Z,et al.Distributed subgraph matching on timely dataflow[J].VLDB Endow,2019.12(10):1099-1112.
[5]WANG K,LIN X,QIN L,et al.Vertex priority based butterfly counting for large-scale bipartite networks[J].VLDB Endow,2019,12(10):1139-1152.
[6]KAPUSTA J,MUNK M,SVEC P.Selection of suitable page-rank calculation for analysis of differences between expected and observed probability of accesses to web page[C]//MIWAI.Springer,2018:139-150.
[7]LIBEN D,KLEINBERG J M.The link-prediction problem forsocial networks[J].JASIST,2007,58(7):1019-1031.
[8]YAN J,WANG W,WU Z,et al.An uncertain graph method based on shuffle model to preserve link privacy of mobile social networks[J].JISE,2024,40(1):125-150.
[9]SEVON P,ERONEN L,HINTSANEN P,et al.Link discovery in graphs derived from biological databases[C]//DILS.Sprin-ger,2006:35-49.
[10]KIMURA M,SAITO K.Tractable models for information diffusion in social networks[C]//PKDD.Berlin:Springer,2006:259-271.
[11]SAHA A,BROKKELKAMP R,VELAJ Y,et al.Shortest paths and centrality in uncertain networks[J].VLDB Endow,2021,14(7):1188-1201.
[12]PENG Y,ZHANG Y,ZHANG W,et al.Efficient probabilistic k-core computation on uncertain graphs[C]//ICDE.IEEE Computer Society,2018:1192-1203.
[13]ZHOU A,WANG Y,CHEN L.Butterfly counting and bitruss decomposition on uncertain bipartite graphs[J].VLDB J,2023,32(5):1013-1036.
[14]KE X,KHAN A,QUAN L H.An in-depth comparison of s-t reliability algorithms over uncertain graphs[J].VLDB Endow,2019,12(8):864-876.
[15]DAI Q,LI R,LIAO M,CHEN H,et al.Fast maximal cliqueenumeration on uncertain graphs:A pivot-based approach[C]//SIGMOD.ACM,2022:2034-2047.
[16]YASUDA N,SUGAYA T,MINATO S.Fast compilation of s-t paths on a graph for counting and enumeration[C]//AMBN.PMLR,2017:129-140.
[17]BIRMELE E,FERREIRA R A,GROSSI R,et al.Optimal lis-ting of cycles and st-paths in undirected graphs[C]//SODA.SIAM,2013:1884-1896.
[18]NISHINO M,YASUDA N,MINATO S,et al.Compiling graph substructures into sentential decision diagrams[C]//AAAI.Springer,2017:1213-1221.
[19]GROSSI R,MARINO A,VERSARI L.Efficient algorithms for listing k disjoint st-paths in graphs[C]//LATIN.Springer,2018:544-557.
[20]RIZZI R,SACOMOTO G,SAGOT M.Efficiently listing boun-ded length st-paths[C]//IWOCA.Springer,2014:318-329.
[21]PENG Y,ZHANG Y,LIN X,et al.Hop-constrained s-t simple path enumeration:Towards bridging theory and practice[J].VLDB Endow,2019,13(4):463-476.
[22]QIU X,CEN W,QIAN Z,et al.Realtime constrained cycle detection in large dynamic graphs[J].VLDB Endow,2018,11(12):1876-1888.
[23]LAI Z,PENG Y,YANG S,et al.PEFP:efficient k hop constrained s-t simple path enumeration on FPGA[C]//ICDE.IEEE,2021:1320-1331.
[24]HAO K,YUAN L,ZHANG W.Distributed hop-constrained s-t simple path enumeration at billion scale[J].VLDB Endow,2021,15(2):169-182.
[25]SUN S,CHEN Y,HE B,et al.Pathenum:Towards real-time hop constrained s-t path enumeration[C]//SIGMOD.ACM,2021:1758-1770.
[26]ZHANG J,YANG S,OUYANG D,et al.Hop constrained s-t simple path enumeration on large dynamic graphs[C]//ICDE.IEEE,2023:762-775.
[27]LI X,HAO K,YANG Z,et al.Hop-constrained s-t simple path enumeration in large uncertain graphs[C]//ADC.Springer,2022:115-127.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!